To check whether Channelflow 2.0 runs properly on your cluster, we might want to compare MPI performance in this forum.
For this purpose, I suggest using the program “benchmark” in the “tools/” directory. Then please provide information similar to this:
@ EPFL cluster Fidis (1 node = 28 CPUs), on 4 nodes = 112 cores
tools/benchmark -nc -np0 8 -np1 14
reading a FlowField file with (Nx,Ny,Nz)=(1024,121,1024) gives
Average time/timeunit: 68.1176s
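For reference, an illustrative way to launch this on 4 x 28 cores would be something along the lines of
mpirun -np 112 tools/benchmark -nc -np0 8 -np1 14
where 8 x 14 = 112 matches the total number of MPI ranks; the exact launcher (mpirun, srun, ...) and any module setup are of course site-specific.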
Looking forward to your numbers
Best,
Florian
PS: just to make sure, it is very important that such benchmarks are done with code compiled in “release” mode.
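Assuming the usual CMake-based build, that would look something like this (the paths are just placeholders):
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release /path/to/channelflow
make -j
An unoptimized “Debug” build can easily be several times slower and would distort the comparison.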
By the way, Channelflow’s performance depends critically on the bandwidth of the inter-node communication. The above performance holds for an InfiniBand FDR interconnect between nodes. On Garcrux we have an InfiniBand EDR interconnect, and get
@ EPFL cluster Garcrux (1 node = 28 CPUs), on 4 nodes = 112 cores, EDR IB
tools/benchmark -nc -np0 8 -np1 14
reading a FlowField file with (Nx,Ny,Nz)=(1024,121,1024) gives
Average time/timeunit: 45.2445s
In this case, the EDR run time is about 2/3 of the FDR run time, i.e. EDR gives roughly a 1.5x speedup over FDR.
The benchmark routine ends up giving CFL=nan when run with a random initial field of resolution (Nx,Ny,Nz)=(1024,121,1024). The tests below were therefore conducted by measuring the time in the regular simulateflow.cpp with its default settings.
With a variable “dt” (dt_avg ~ 0.012) and a resolution of (Nx,Ny,Nz)=(1024,121,1024), on the cluster ADA @ IDRIS, Paris, run on 4 nodes (1 node = 32 cores) = 128 cores over an InfiniBand FDR10 Mellanox network (2 links per node), the performance is Average Time / Time unit = 168.18 s.
With a fixed dt = 0.001, the same case gives Average Time / Time unit = 40.45 min, roughly consistent with having to take about twelve times more time steps per time unit than in the variable-dt run.
On the cluster TURING @ IDRIS, the same case run on 64 nodes (1 node = 16 cores) = 1024 cores with a variable “dt” (dt_avg ~ 0.015) gives Average Time / Time unit = 102.4 s.