Reading binary file to MultiScalarField

Hello all,
I was wondering whether there are options for binary (flat) file parallel I/O in Palabos. I see there are output options for writing VTK, but can I use plb_ifstream and plb_ofstream directly? I am particularly interested in reading from a binary file directly into a MultiScalarField3D.

Thanks in advance.

Cheers,
C.S.N

Hello,

The command parallelIO::load(fileName, multiScalarField) reads a binary file (z-index is contiguous). The dimensions of the field plus some other info must be indicated in an XML file. I suggest you first write a scalar-field through the command parallelIO::save(multiScalarField, fileName), in order to understand the format of the XML file (you can read it in a text editor).
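
For instance, a minimal round trip could look something like the sketch below. The dimensions, the setToConstant initialization and the file base name are only placeholders; the point is simply the save followed by the load:

#include "palabos3D.h"
#include "palabos3D.hh"
#include <string>

using namespace plb;
typedef double T;

int main(int argc, char* argv[])
{
    plbInit(&argc, &argv);

    const plint nx = 64, ny = 64, nz = 64;    // placeholder dimensions
    std::string fileName = "myField";         // placeholder base name

    MultiScalarField3D<T> field(nx, ny, nz);
    setToConstant(field, field.getBoundingBox(), (T) 1.);

    // Writes the binary data together with the XML description of the layout.
    parallelIO::save(field, fileName);

    // Reads the data back; the XML file tells Palabos how the binary file is laid out.
    MultiScalarField3D<T> copy(nx, ny, nz);
    parallelIO::load(fileName, copy);

    return 0;
}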

Cheers,
Jonas

Dear Jonas,
Thank you very much for the information and, of course, also for Palabos :-). parallelIO::save and load do read and write binary files as you said, but the order in which they write appears mixed up. For example, when I read the output.dat file from the permeability tutorial and write it out again as binary, the output is not correct for np > 2. It looks like the code does a checkerboard decomposition, but when it writes, the order in which the processors' data appears in the binary file seems incorrect. The code is below in case anyone wants to reproduce the result. I am not sure whether this was intended. Any chance the wrong communicator is being passed somewhere, or am I doing something blatantly wrong?

Thanks in advance

C.S.N

#include "palabos3D.h"
#include "palabos3D.hh"
#include <cstdlib>
#include <string>

using namespace plb;
using namespace std;

typedef double T;
#define DESCRIPTOR descriptors::D3Q19Descriptor

int main(int argc, char *argv[])
{
    plbInit(&argc, &argv);

    std::string fNameIn = argv[1];

    const plint nx = atoi(argv[2]);
    const plint ny = atoi(argv[3]);
    const plint nz = atoi(argv[4]);

    const T omega = 1.0;

    MultiBlockLattice3D<T,DESCRIPTOR> lattice(nx,ny,nz, new BGKdynamics<T,DESCRIPTOR>(omega));

    // Read the geometry (ASCII, as in the permeability tutorial) into a scalar-field.
    MultiScalarField3D<int> geometry(nx,ny,nz);
    plb_ifstream geometryFile(fNameIn.c_str());
    if (!geometryFile.is_open()) {
        pcout << "Error: could not open geometry file " << fNameIn << endl;
        return -1;
    }
    geometryFile >> geometry;
    pcout << "nx = " << lattice.getNx() << endl;
    pcout << "ny = " << lattice.getNy() << endl;
    pcout << "nz = " << lattice.getNz() << endl;

    // Write the scalar-field back out as a binary file plus its XML descriptor.
    parallelIO::save(geometry, argv[5]);

    return 0;
}
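
For reference, I invoke it with something along the lines of the following (the executable name, geometry file and sizes are just placeholders):

mpirun -np 4 ./readGeometry geometry.dat 64 64 64 outputName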

Hi,

Yes, you are right. Whenever a program is executed in parallel (or whenever for some other reason the multi-blocks are subdivided into several atomic-blocks), the data is written atomic-block after atomic-block, and there is no global linear ordering of the indices in the file. The reason for this is to enable Palabos to perform parallel Input/Output: every atomic-block is positioned at a given offset in the file and can be written at any time (concurrently) by the processor that holds the data.

You can of course still write a file with, say, 10 processors, and then read the data into a program running on 100 processors. Palabos just needs to know the data layout according to which the file was written. This information is provided in the .plb file.

If all you want to do is write a file, and later on read it again in Palabos, this process is transparent and you don’t need to care about the I/O internals. Now, in your case I understand that you would like to produce the binary input data with another software and read it into Palabos. Well, the simplest way to get this done is to choose a data layout with a single atomic-block. So, run first an example program on one processor (non-parallel), to be sure the data layout has a global linear order, and use the produced .plb file as a template for your own binary input files.
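
As an illustration of that last point, here is a rough sketch of how an external tool (outside Palabos) could dump data in that single-block, z-contiguous layout. The dimensions, value type (double) and file name are assumptions; it is worth comparing the result against a file produced by parallelIO::save on one processor to confirm the exact byte layout:

#include <cstdio>
#include <vector>

int main()
{
    // Placeholder dimensions; they must match the ones declared in the .plb template.
    const long nx = 64, ny = 64, nz = 64;

    std::vector<double> data(nx * ny * nz);
    for (long iX = 0; iX < nx; ++iX) {
        for (long iY = 0; iY < ny; ++iY) {
            for (long iZ = 0; iZ < nz; ++iZ) {
                // z-index contiguous: iZ varies fastest.
                data[(iX * ny + iY) * nz + iZ] = 0.;   // replace with your own value
            }
        }
    }

    // Raw binary dump; the single-block .plb file describes this layout for Palabos.
    std::FILE* f = std::fopen("geometry.dat", "wb");
    std::fwrite(&data[0], sizeof(double), data.size(), f);
    std::fclose(f);
    return 0;
}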

Things are a little bit tougher if your input data is too large to fit into the memory of a single processor (if you have, say, a terabyte of input data). The problem is that if you specify your data to consist of a single atomic-block, Palabos will go ahead and try to read it as a single block into one processor’s memory (before parallelizing it appropriately), and will run out of memory. In this case you must cut it up into smaller slices, or into any other kind of regular blocks, and specify the data layout in the .plb file. Again, let me insist that this is just to avoid running out of memory during the input stage. This data decomposition has nothing to do with the actual parallelization in the program, because Palabos re-parallelizes the data after having read it.

I strongly suggest that you look at one of these .plb files to see how things work, because it’s really not difficult to understand. The .plb files are simple (XML) text files, and they essentially specify the dimension and position of each atomic-block, and its offset in the binary file. Nothing unexpected is going on here, at least for scalar- and tensor-fields. With block-lattices some kind of higher-order magic occurs, as the content of the dynamics objects is being serialized into the data files. Therefore, if you write your own binary input files, restrict yourself to scalar- and tensor-fields. If you want to read the populations of a lattice, read them into a tensor-field and then copy them into the lattice.

Good luck,
Jonas

Dear Jonas,
Thank you very much for the detailed explanation; it is exactly what I was looking for. I did look through the XML file before posting back to the forum and noted the offsets, which is what prompted me to visualize the raw data. I was originally going to ask how the offsets worked. I had initially assumed that Palabos used MPI_Subarray for the collective I/O, so for the np > 2 case the offsets didn’t appear correct. After your explanation, things make a lot more sense.

I am just beginning to futz around with LB; I cannot stress enough how useful Palabos is turning out to be.

Thank you very much!!

Cheers,
C.S.N