I’m running channelflow on a cluster with 28 compute nodes. Each node has two 12-core Xeon E5-2680 CPUs.
I envision two scales of channelflow computations:
- small, running on one node with 24 cores, 128 x 129 x 128 discretization
- large, running on, say, sixteen nodes with 384 cores, 1024 x 129 x 1024 discretization
What are decent starting points for the values of -np0 and -np1 for these two simulations?
Hello John,
there is only one hard requirement for the MPI distribution, which is mod(Nx, np0) = 0. The reason is the way the FFTW transform plans are set up. If you do not respect this, you will see a runtime error. In your case, any power of 2 will work for np0.
I typically use a distribution which is close to equally distributed, because it makes the data chunks most likely to be the same size on each process (“pencil distribution”). In your case, this would be (np0, np1) = (4, 6) for small and (np0, np1) = (16, 24) for large. However, it might be worth testing whether a “slab distribution” is faster in your case. This means that only one dimension is distributed, say (np0, np1) = (1, 384) for large. The advantage is that you save communication cost. The disadvantage is that it is more likely to distribute the data unequally over the tasks. np1 divides the x-dimension in physical space and the z-dimension in spectral space. If your numerical domain is very long but narrow, a “slab distribution” is probably a bad idea, but your domain has equal sides, so it might be more performant.
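If it helps to make the bookkeeping explicit, here is a small Python sketch (nothing channelflow-specific; `candidate_grids` is just an illustrative helper) that lists all (np0, np1) pairs satisfying mod(Nx, np0) = 0 for a given core count, and picks out the slab and the most-square pencil candidates for your two cases:

```python
# Sketch only: enumerate candidate (np0, np1) process grids for a given
# total core count, keeping only those that satisfy the hard requirement
# mod(Nx, np0) == 0 discussed above.

def candidate_grids(nx, ncores):
    """Return all (np0, np1) pairs with np0 * np1 == ncores and nx % np0 == 0."""
    return [(np0, ncores // np0)
            for np0 in range(1, ncores + 1)
            if ncores % np0 == 0 and nx % np0 == 0]

if __name__ == "__main__":
    # Grid sizes and core counts taken from the two cases in the question.
    for label, nx, ncores in [("small", 128, 24), ("large", 1024, 384)]:
        pairs = candidate_grids(nx, ncores)
        # (1, ncores) is the slab distribution; the pair closest to a square
        # process grid is the pencil distribution discussed above.
        pencil = min(pairs, key=lambda p: abs(p[0] - p[1]))
        print(f"{label}: slab = (1, {ncores}), pencil = {pencil}, all = {pairs}")
```

For the small case this picks (4, 6) as the most-square pencil grid and for the large case (16, 24), i.e. the starting points suggested above; the full list gives you the other admissible grids to benchmark against the slab distribution.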
On the computer and system that we use (an IBM x3750), we find that (np0, np1) = (1, np) is optimal for our (Lx, Lz) = (10, 40) domains, at least up to np = 64.
Laurette: I see the same on my Intel Xeon E5-2680 cluster. (np0, np1) = (1, np) is optimal for all np I’ve tested, up to 64.