Possible parallelism or grid refinement problem

Hello,

I have noticed a problem similar to the one reported by jongan, with the program “dipole.cpp” in the directory /palabos-v1.5r1/examples/showCases/gridRefinement2d/

With no changes to the program (using the latest Palabos version):

When I run the program dipole.cpp with 5 cores:

#OAR -l /core=5,walltime=5:00:0

and resolution equal to 50

mpirun --prefix $PREF -np $NSLOTS -machinefile $OAR_NODEFILE ./dipole 50

everything works fine.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Now, when I run the program dipole.cpp with 6 cores:

#OAR -l /core=6,walltime=5:00:0

and resolution equal to 50

mpirun --prefix $PREF -np $NSLOTS -machinefile $OAR_NODEFILE ./dipole 50

I obtain an error in a file named:

dipole.80s-31187,node056.btr

containing:

dipole:31187 terminated with signal 11 at PC=455765 SP=7fffc2659ba8. Backtrace:
./dipole[0x455765]
./dipole[0x48f1c6]
./dipole[0x48d4ac]
./dipole[0x470a6e]
./dipole[0x4c5e18]
./dipole[0x4c8f3c]
./dipole[0x51b8fc]
./dipole[0x51b682]
./dipole[0x4b63d7]
./dipole[0x408672]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x38e1a1ed5d]
./dipole[0x407969]

The program keeps running but it is stuck: there is no more output.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Now, if I change the size of the finest refinement box at line 281 of the program dipole.cpp:

The original code is:

Box2D refinementLevel1(4*N/3,2*N,2*N/4+N/12,2*3*N/4-N/12); // refine near the wall where the collision happens

The changed code is:

Box2D refinementLevel1(4*N/3-20,2*N,2*N/4+N/12,2*3*N/4-N/12); // refine near the wall where the collision happens

and run the program with 6 cores (the case that previously failed)

#OAR -l /core=6,walltime=5:00:0

and resolution equal to 50

mpirun --prefix $PREF -np $NSLOTS -machinefile $OAR_NODEFILE ./dipole 50

now everything works fine again.
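
To make the change concrete, here is a small sketch of the integer arithmetic in the two Box2D calls from line 281. It assumes that N is the resolution passed on the command line (50), that the arguments are in the order (x0, x1, y0, y1), and that both bounds are inclusive. Bounds, refinementLevel1Bounds, widthX and printBounds are names made up for this illustration; they are not Palabos functions.

```cpp
#include <cstdio>

// Inclusive integer bounds, in the same order as the Box2D arguments
// in the post: (x0, x1, y0, y1).
struct Bounds {
    int x0, x1, y0, y1;
};

// Evaluates the four expressions from line 281 of dipole.cpp using C++
// integer division. xShift is a hypothetical parameter: 0 reproduces the
// original call, 20 reproduces the modified one (4*N/3-20).
constexpr Bounds refinementLevel1Bounds(int N, int xShift) {
    return Bounds{4 * N / 3 - xShift,
                  2 * N,
                  2 * N / 4 + N / 12,
                  2 * 3 * N / 4 - N / 12};
}

// Number of cells along x, assuming both bounds are inclusive.
constexpr int widthX(const Bounds& b) {
    return b.x1 - b.x0 + 1;
}

// Prints one set of bounds so the two variants can be compared side by side.
void printBounds(const char* label, const Bounds& b) {
    std::printf("%s: x=[%d,%d] y=[%d,%d] widthX=%d\n",
                label, b.x0, b.x1, b.y0, b.y1, widthX(b));
}
```

Calling printBounds("original", refinementLevel1Bounds(50, 0)) and printBounds("modified", refinementLevel1Bounds(50, 20)) from a main function shows that, under these assumptions, the only difference between the failing and the working case is the lower x bound of the refined box: 66 in the original and 46 after the change. The y range is [29,71] in both cases.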

Does anyone know where the problem is?

Thanks a lot!