MPI vs serial, findsoln et al

Hi all,

I have been playing around with the MPI version. On installation, all tests pass, which is good news. However, I notice there is no test coverage (at least by default) for things like continuesoln and findsoln.

In fact, I find that if I converge a state I typically get a different answer if I build with MPI versus without MPI. I wonder if this is known about and has an explanation?

As a starting point, I’ve taken EQ1 at Re = 400 and run

findsoln -eqb -R 405 -T 20 -L eq1.nc

to converge to a new state at Re = 405.

If I compute the L2 distance with L2op between the states obtained with channelflow compiled as serial vs MPI, it is ~4.5e-8.

The converged solution in the case of the MPI build is always the same - regardless of how many MPI tasks there are (even if there’s just 1).

Dear Jake,

thank you a lot for your bug report and sorry for having to wait so long for a response. I was able to reproduce exactly what you describe. The converged states differ depending on the build type (serial vs. parallel). The fact that also searches with a single MPI task reproduce this offset means that we can exclude MPI communication as reasons for this bug. I guess the different internal data layout of the state vector on which the Newton operates could create this problem. Did you realize that without Stokes preconditioning (-L) the difference is one order of magnitude smaller? I do not understand yet what is going on and will come back to you after some investigations.

Did you realize that without Stokes preconditioning (-L) the difference is one order of magnitude smaller?

Hmmm, perhaps this points to the possibility that the small errors are coming from the GMRES implementation, i.e. because in the Stokes case it’s doing an order of magnitude more iterations.

First thing I’d suggest is to rule out the possibility of a difference in the arbitrary phase shift in x or z in the solution. Without symmetry restrictions the Newton-Krylov algorithm will generally include small drifts in the solution in x or z, since they neither hurt nor help the reduction of the residual.

For a quick check, compare the output of fieldprops -n -dg 16 on both fields and see if they have the same norms to many digits. If so that indicates the two solutions are really the same, just drifted a little in either/both x and z.

Or recompute the two solutions with symmetries enforced to fix the spatial phase, e.g.

findsoln  -eqb -R 405 -T 20 -L -symms sxyz-sztxz.asc eq1.nc

where sxyz-sztxz.asc specifies the symmetry group of the solution and has content

% 2
1  1  1 -1 0.5 0.5
1 -1 -1 -1 0.0 0.0

assuming you have eq1 in the phase where those are the correct symmetries.

1 Like

Ah that’s right, thanks! On fixing the symmetry the discrepancy drops to ~1e-14 at which point I suppose the rest may be accounted for by the MPI build reordering the arithmetic or something similar.

I never did quite get my head around how channelflow imposes these symmetries (and the file format). Is there a simple way to generate these files based on computing symmetry properties of an exisiting FlowField?

1 Like

Yes. The findsymmetries utility will search through a set of possible discrete symmetries and tell you which one a given field satisfies. E.g.

gibson@sophist$ findsymmetries EQ1.h5 
satisfied 4 symmetries to eps == 1e-06
  1  1  1  1    0       0
  1 -1 -1 -1    0       0
  1  1  1 -1    0.5     0.5
  1 -1 -1  1    0.5     0.5

The output is s sx sy sz ax az where sx represents inversion along the x axis, etc, and ax represents a phase shift in x of length ax*Lx. Findsymmetries gives you the whole symmetry group; you can pick out any two of the non-identities symmetries as generators of the group and enforce these during time-stepping or solution finding. E.g. create file sxyz-sztxtz.asc with contents

% 2
  1 -1 -1 -1    0       0
  1  1  1 -1    0.5     0.5

The run findsoln -eqb -R 405 -T 20 -L -symms sxyz-sztxz.asc eq1.nc and it’ll enforce those symmetries during Newton-Krylov search.

Findsymmetries works best when the field is already nearly aligned with the box, i.e. if there’s a z mirror symmetry, the plane of that symmetry should be very nearly z=0 or z=Lz/2. You can adjust the tolerance for nonalignment using the -e option to findsymmetries.

1 Like

This is all very helpful, thanks.

Just to check for my understanding - these .asc files are specifying discrete symmetries, but in your original reply you hinted that Channelflow fixes x and z shifts too if the -symms option is used. So to fix just the continuous symmetries for plane Couette one could presumably just use a file with the identity element, i.e.

% 1
1  1  1  1	0	0

That’s not quite how it works. To fix the z phase of a field, you need some kind of z mirror symmetry, essentially any symmetry with sz=-1. (Likewise for x). I’ll have to spell this out in some detail, I think.

The symmetries files specify symmetry operators in terms of parameters

s sx sy sz  ax  az

where the s parameters are +/-1 and ax, az are real numbers in [0,1]. Let sigma be a symmetry parameterized like that. The action of sigma on a velocity field is

sigma [u,v,w](x,y,z) = s [sx u, sy v, sz w](sx x + ax Lx, sy y, sz z + az Lz)

E.g. the symmetry operator sigma = 1 1 1 -1 0 0 has action

sigma [u,v,w](x,y,z) = [u, v, -w](x, y, -z)

We say a velocity field has a given symmetry if that field is invariant under the symmetry operator, i.e. sigma u = u (where we’re using u as shorthand for [u,v,w](x,y,z)). Thus a velocity field has symmetry sigma = 1 1 1 -1 0 0 if

[u,v,w](x,y,z) = [u, v, -w](x, y, -z)

That’s z-mirror symmetry about the z=0 plane. This symmetry fixes the phase of the field in z.

The example I posted previously fixes both the x and z phase because it has symmetries with both sx=-1 and sz=-1.

1 Like

I see, so setting a reflection symmetry anchors the solution to the relevant symmetry axis and thereby fixes the continuous symmetry as a byproduct. Should have picked up on this - thanks, it makes sense to me now.

1 Like