Jlabos

Hi,

I tried to test the Jlabos toolkit on a machine that does not have Jlabos installed.
I managed to compile the example .jar files using the SCons build tool.
I want the generated jar files to be machine-independent binaries, meaning they can be executed on another machine provided we specify the .so dependencies.

Machine where Jlabos is installed: /home/benbelgacem/palabosjlabos/palabos-v1.0r1
Machine without a Jlabos installation: /home/mohamed/

ls /home/mohamed/muscle_mpi/plb

drwxrwxr-x 2 mohamed mohamed 4096 Jan 6 10:22 jlabos
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_core.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_double_block.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_double_d2q9.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_double_d3q19.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_float_block.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_float_d2q9.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_float_d3q19.so
-rwxrwxr-x 1 mohamed mohamed 7661 Jan 6 10:23 lib_int_block.so
-rw-rw-r-- 1 mohamed mohamed 19116266 Jan 6 11:37 libplb_mpi.a
-rwxrwxr-x 1 mohamed mohamed 2346049 Jan 6 11:37 libplb_mpi.so

This is the stack trace from running it on that machine:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /home/mohamed/muscle_mpi/plb/lib_core.so: /home/benbelgacem/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1807)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1732)
at java.lang.Runtime.loadLibrary0(Runtime.java:823)
at java.lang.System.loadLibrary(System.java:1028)
at jlabos.JlabosBase.(JlabosBase.java:7)
at jlabos.JlabosBase.getSingletonObject(JlabosBase.java:26)
at Cavity2d.O_i(Cavity2d.java:96)
at Cavity2d.execute(Cavity2d.java:61)
at muscle.core.kernel.RawKernel.executeDirectly(RawKernel.java:295)
at utilities.MpiSlaveKernelExecutor.main(MpiSlaveKernelExecutor.java:23)

It seems that some call to System.loadLibrary(…) uses a hardcoded path that was fixed at compile time (/home/benbelgacem/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so: cannot open shared object file)

It is not the System.loadLibrary argument that is hardcoded, but the dependencies recorded inside the .so library.

Try adding the .so path to LD_LIBRARY_PATH, or registering it with ldconfig.
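
To illustrate the LD_LIBRARY_PATH suggestion, here is a minimal shell sketch that prepends the library directory without duplicating it. The function name prepend_ld_library_path is made up for this example, and the directory is the one from the listing above. Whether this helps depends on how the dependency was recorded in lib_core.so: as far as I know, the dynamic linker only searches LD_LIBRARY_PATH for dependency names that contain no slash, while a name stored as a full path is opened as is.

```shell
#!/bin/sh
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# (prepend_ld_library_path is a hypothetical helper name.)
prepend_ld_library_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;    # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Directory taken from the listing earlier in the thread.
prepend_ld_library_path /home/mohamed/muscle_mpi/plb
echo "$LD_LIBRARY_PATH"
```

Running ldd /home/mohamed/muscle_mpi/plb/lib_core.so afterwards shows how each dependency resolves, and readelf -d on the same file lists the recorded NEEDED entries.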

Thanks for the response,

I don’t have sudo privileges on the cluster (Scientific Linux) where I want to run the example, and LD_LIBRARY_PATH does not resolve the problem. Instead, I had to modify the postprocess.sh script slightly and recompile the code, because Scientific Linux raises an “ELF file OS ABI invalid” error for shared libraries compiled on 64-bit Ubuntu.
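
For anyone hitting the same “ELF file OS ABI invalid” message, here is a small shell sketch to check which OS/ABI value a library carries. It reads byte 7 of the ELF header (the EI_OSABI field), where 0 means System V and 3 means GNU/Linux. The function name elf_osabi is made up for this example, and the assumption that an older loader on the cluster rejects libraries marked 3 is mine, not something confirmed in this thread.

```shell
#!/bin/sh
# Print the EI_OSABI byte (offset 7 in the ELF header) as a decimal number.
# 0 = System V (none), 3 = GNU/Linux.
# (elf_osabi is a hypothetical helper name.)
elf_osabi() {
    od -An -tu1 -j7 -N1 "$1" | tr -d '[:space:]'
    echo
}

# Report the value for every shared library in the current directory.
for lib in ./*.so; do
    [ -f "$lib" ] || continue    # skip if the glob matched nothing
    printf '%s: OS/ABI %s\n' "$lib" "$(elf_osabi "$lib")"
done
```

Comparing the value for a library built on Ubuntu with one rebuilt on the cluster should show whether the two toolchains mark their output differently.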

I then tried to run Cavity2d.jar with MPI on more than one cluster node.
Below is my PBS job description file:

#PBS -N MPI_MUSCLE_TEST
#PBS -l nodes=2:ppn=2

# End of arguments to qsub

#Load mpi-module (for example openmpi in this case on Grass cluster)
module load openmpi

# cd working directory

cd /home/mohamed/palabosjlabos/jlabos/examples/Cavity2d

mpirun -np 4 java -jar -Djava.library.path=…/…/src/jlabos/plb:/usr/lib Cavity2d.jar
#End of script (make sure line before this gets run)
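
One thing that may be worth ruling out in a job script like this is a relative java.library.path, since it is resolved against the directory each process starts in. Below is a minimal shell sketch that turns a list of directories into an absolute, colon-separated path and warns about missing ones. The function names abs_dir and java_library_path and the variable PLB_LIB_DIR are made up for this example. It is only a sanity check, not a fix for any particular error.

```shell
#!/bin/sh
# Resolve a directory to an absolute path; fails if it does not exist.
# (abs_dir and java_library_path are hypothetical helper names.)
abs_dir() {
    ( CDPATH= cd -- "$1" 2>/dev/null && pwd -P )
}

# Join the existing directories into a colon-separated absolute path,
# printing a warning on stderr for each directory that is missing.
java_library_path() {
    result=""
    for dir in "$@"; do
        resolved=$(abs_dir "$dir") || {
            echo "warning: library directory not found: $dir" >&2
            continue
        }
        result="${result:+$result:}$resolved"
    done
    printf '%s\n' "$result"
}

# Possible use in the job script (PLB_LIB_DIR is a hypothetical variable
# holding the directory with the Jlabos .so files):
# mpirun -np 4 java -Djava.library.path="$(java_library_path "$PLB_LIB_DIR" /usr/lib)" -jar Cavity2d.jar
```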

It seems that there is a bug:

terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
[grass1:26040] *** Process received signal ***
[grass1:26040] Signal: Aborted (6)
[grass1:26040] Signal code: (-6)
[grass1:26040] [ 0] /lib64/libpthread.so.0 [0x387c60eb10]
[grass1:26040] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x387ba30265]
[grass1:26040] [ 2] /lib64/libc.so.6(abort+0x110) [0x387ba31d10]
[grass1:26040] [ 3] /usr/java/jdk1.6.0_29/jre/…/lib/amd64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x114) [0x3914ebed14]
[grass1:26040] [ 4] /usr/java/jdk1.6.0_29/jre/…/lib/amd64/libstdc++.so.6 [0x3914ebce16]
[grass1:26040] [ 5] /usr/java/jdk1.6.0_29/jre/…/lib/amd64/libstdc++.so.6 [0x3914ebce43]
[grass1:26040] [ 6] /usr/java/jdk1.6.0_29/jre/…/lib/amd64/libstdc++.so.6 [0x3914ebcf2a]
[grass1:26040] [ 7] /usr/java/jdk1.6.0_29/jre/…/lib/amd64/libstdc++.so.6(_Znwm+0x79) [0x3914ebd239]
[grass1:26040] [ 8] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZNSt6vectorIcSaIcEE14_M_fill_insertEN9__gnu_cxx17__normal_iteratorIPcS1_EEmRKc+0x22d) [0x2aaab306596d]
[grass1:26040] [ 9] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZN3plb20RecvPoolCommunicator14receiveDynamicEi+0x261) [0x2aaab31368b1]
[grass1:26040] [10] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZN3plb20RecvPoolCommunicator14receiveMessageEib+0x51) [0x2aaab3136091]
[grass1:26040] [11] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZNK3plb27ParallelBlockCommunicator2D11communicateERNS_24CommunicationStructure2DERKNS_12MultiBlock2DERS3_NS_5modif6ModifTE+0x30d) [0x2aaab313390d]
[grass1:26040] [12] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZNK3plb27ParallelBlockCommunicator2D17duplicateOverlapsERNS_12MultiBlock2DENS_5modif6ModifTE+0x174) [0x2aaab3133014]
[grass1:26040] [13] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZN3plb12MultiBlock2D17duplicateOverlapsENS_5modif6ModifTE+0x20) [0x2aaab30792b0]
[grass1:26040] [14] /home/mohamed/palabosjlabos/palabos-v1.0r1/lib/libplb_mpi.so(_ZN3plb19PeriodicitySwitch2D9toggleAllEb+0x22) [0x2aaab30749f2]
[grass1:26040] [15] /home/mohamed/palabosjlabos/jlabos/src/lib/libplbwrapLattice_d2q9_double_mpi.so(_ZN3plb27generateMultiBlockLattice2DIdNS_11descriptors14D2Q9DescriptorEEEPNS_19MultiBlockLattice2DIT_EERKNS_5Box2DEPKNS_8DynamicsIS4_EE+0x9f) [0x2aaaba8fa7af]
[grass1:26040] [16] /home/mohamed/palabosjlabos/jlabos/src/lib/libswig_double_d2q9.so(Java_jlabos_double_1d2q9JNI_double_1D2Q9Descriptor_1generateMultiBlockLattice+0x16) [0x2aaabd7446f6]
[grass1:26040] [17] [0x2aaaab26bb0c]
[grass1:26040] *** End of error message ***
[grass2.man.poznan.pl][[50143,1],0][btl_tcp_frag.c:214:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)

mpirun noticed that process rank 2 with PID 26040 on node grass1.man.poznan.pl exited on signal 6 (Aborted).

dx : 0.0033333333333333335
nu : 0.06
tau : 0.6799999999999999
301
301
Internal structure of the 301-by-301 multi-block:
Number of blocks in multi-block:4
Smallest atomic-block: [151,300, 151,300]
Largest atomic-block: [0,150, 0,150]
Number of allocated cells: 0.090601million
Percentage of allocated cells in multi-block: 100.0

Before trying to compile a fresh copy of Jlabos from svn on a cluster where I don’t have sudo privileges, I:
- installed a local copy of swig
- added the paths to the Java Linux include directories in the file ~JLABOSS_HOME/src/MakeFile : includePaths = /usr/java/jdk1.6.0_29/include/ /usr/java/jdk1.6.0_29/include/linux/
- called make

The execution trace is:

cd compilePalabos; make
make[1]: Entering directory `/mnt/auto/people/plgmohamed/plb/trunk/jlabos/src/compilePalabos'
python /people/plgmohamed/plb/palabos-v1.0r1//scons/scons.py -j 4 -f /people/plgmohamed/plb/palabos-v1.0r1//SConstruct palabosRoot=/people/plgmohamed/plb/palabos-v1.0r1/ projectFiles="dummyMain.cpp" precompiled=true optimize=true debug=false profile=false MPIparallel=true SMPparallel=false usePOSIX=true serialCXX=g++ parallelCXX=mpicxx dynamicLibrary=true compileFlags="-Wl,--no-as-needed -Wl,-Bsymbolic -pthread" linkFlags="-Wl,-Bsymbolic -pthread" optimFlags="-O3" debugFlags="-g" profileFlags="-pg" libraryPaths="" includePaths="/usr/lib/jvm/java-6-openjdk/include/" libraries=""
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.
make[1]: Leaving directory `/mnt/auto/people/plgmohamed/plb/trunk/jlabos/src/compilePalabos'
bash ./preprocess /people/plgmohamed/plb/trunk/jlabos/src
~/plb/trunk/jlabos/src/swig ~/plb/trunk/jlabos/src
Swig-file preparation for module core
./preprocess: line 12: java_heap.i: No such file or directory