Loading data from various .hdf5 files in parallel #3371

joernwichmann · 2024-01-30T01:58:31Z

Hello everyone,

I am looking for help on the following topic. Let's say an oracle provides us with approximations to a stochastic PDE. They are stored in the following data format:

Each level directory contains many .hdf5 files containing sample trajectories.

My job is to load the approximations, so that further processing steps can be taken. I have implemented a sequential loader that works fine. However, it only uses one core, drastically limiting the loading speed. I aim to increase the loading speed by using multiprocessing. But unfortunately my first attempt fails. Let me guide you through this attempt and where it fails.

First I define space and time discretisation classes encapsulating discretisation variables.

Next, a generic loader is defined using the load functions of firedrake.CheckpointFile.
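
For readers without the original snippet, here is a minimal sketch of what such a generic loader could look like. It is not the code from this post: the function name `load_checkpoint_function` and its parameters are hypothetical, and it assumes the file was written with `CheckpointFile.save_mesh` and `CheckpointFile.save_function`.

```python
from typing import Optional


def load_checkpoint_function(path: str, function_name: str,
                             mesh_name: Optional[str] = None,
                             idx: Optional[int] = None):
    """Load one Function (and the mesh it lives on) from a checkpoint file.

    Hypothetical helper, not the loader from the original post.
    """
    # Imported lazily so that this module can be imported without Firedrake.
    from firedrake import CheckpointFile

    with CheckpointFile(path, "r") as checkpoint:
        # Without a name, Firedrake falls back to its default mesh name.
        if mesh_name is None:
            mesh = checkpoint.load_mesh()
        else:
            mesh = checkpoint.load_mesh(mesh_name)
        # idx selects one entry of a time series saved under the same name.
        return checkpoint.load_function(mesh, function_name, idx=idx)
```

Each call loads its own copy of the mesh, so functions returned by separate calls do not share a mesh object.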

The generic loader is used to define a specific loader using the storage directory structure.

Now, we can either use the specific loader sequentially or in parallel.

The sequential load just iterates over the requested samples.
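
As a rough illustration of that pattern (not the original code), the sequential load can be sketched as a plain loop over sample ids. The names `load_sequentially` and `fake_loader` are hypothetical; `fake_loader` only stands in for a real per-sample loader.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def load_sequentially(load_sample: Callable[[int], T],
                      sample_ids: Iterable[int]) -> List[T]:
    """Load the requested samples one after another on a single core."""
    return [load_sample(sample_id) for sample_id in sample_ids]


def fake_loader(sample_id: int) -> str:
    """Stand-in for a real loader that would read one .hdf5 file."""
    return f"trajectory-{sample_id}"


loaded = load_sequentially(fake_loader, range(3))
```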

The sequential loader works fine.

The parallel load uses the package concurrent.futures.

Unfortunately, when I run the parallel load, I get the following error message.

Q: Do you have any idea how to fix the parallel load?

connorjward · 2024-01-30T08:46:20Z

Maintainer

Firedrake uses MPI to handle parallelism. To get parallel HDF5 loads of CheckpointFiles you simply need to execute mpiexec -n #procs python myfile.py instead of python myfile.py. The sequential and parallel codes should be identical.

If you want the ranks to act independently you might want to use ensemble parallelism instead.

Does this answer your question?

3 replies

joernwichmann Jan 31, 2024
Author

Do I understand correctly that mpiexec -n #procs python myfile.py executes myfile.py n times without any communication in between? This is not what I need.

I want to have local loaders that load partial data. Afterwards the partial data should be sent to a master process that assembles the partial data into a global object. Can I use firedrake.Ensemble to communicate generic objects rather than firedrake.Functions?

connorjward Jan 31, 2024
Maintainer

mpiexec launches a number of processes that each run the same Python script. These processes are able to communicate with each other since they share an MPI communicator (mpi4py.MPI.COMM_WORLD). For a Firedrake simulation what this means is that each process only sees a local portion of the full mesh. This is done totally transparently to the user. You should just be able to wrap your serial script in an mpiexec and we will automagically distribute the mesh and handle all communication.
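
To make the "processes can communicate" point concrete, here is a small sketch using plain mpi4py, independent of any Firedrake loading. The helper names `flatten` and `gather_on_root` are made up for illustration; `comm.gather` is the standard mpi4py call for picklable Python objects.

```python
from typing import Any, List, Optional


def flatten(chunks: List[List[Any]]) -> List[Any]:
    """Concatenate the per-rank lists into one list, in rank order."""
    return [item for chunk in chunks for item in chunk]


def gather_on_root(local_items: List[Any], root: int = 0) -> Optional[List[Any]]:
    """Collect every rank's list on `root`; all other ranks receive None."""
    # Imported lazily so that `flatten` can be used without MPI installed.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    # Lowercase gather pickles generic Python objects; root gets one list per rank.
    chunks = comm.gather(local_items, root=root)
    return flatten(chunks) if comm.Get_rank() == root else None
```

Run under something like mpiexec -n 4 python myfile.py, every rank would call `gather_on_root` with its own list, and only rank 0 would get the combined result back.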

CheckpointFiles work in the same way. You can load the file on any number of processes and each process will then only see its partitioned section of the mesh.

I don't understand from your example whether you are trying to (a) load a single CheckpointFile shared between processes, or (b) load multiple CheckpointFiles, one per process. mpiexec will immediately give you (a); for (b) you would need ensemble parallelism.
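
A rough sketch of what (b) could look like with an Ensemble, under several assumptions: one MPI rank per ensemble member, a round-robin split of the file list, and hypothetical helper names (`files_for_member`, `load_my_share`). Treat it as a starting point rather than a tested recipe.

```python
from typing import List, Sequence


def files_for_member(paths: Sequence[str], member: int, n_members: int) -> List[str]:
    """Round-robin split: member m takes files m, m + n_members, ..."""
    return list(paths[member::n_members])


def load_my_share(paths: Sequence[str], mesh_name: str, function_name: str):
    """Load this ensemble member's share of the checkpoint files."""
    # Imported lazily so that the helper above works without Firedrake.
    from firedrake import COMM_WORLD, CheckpointFile, Ensemble

    # Assumption: 1 rank per member, so every rank loads whole files on its own.
    ensemble = Ensemble(COMM_WORLD, 1)
    member = ensemble.ensemble_comm.rank
    n_members = ensemble.ensemble_comm.size

    functions = []
    for path in files_for_member(paths, member, n_members):
        # Open the file on the member's own (spatial) communicator only.
        with CheckpointFile(path, "r", comm=ensemble.comm) as checkpoint:
            mesh = checkpoint.load_mesh(mesh_name)
            functions.append(checkpoint.load_function(mesh, function_name))
    return ensemble, functions
```

Since `ensemble.ensemble_comm` is an mpi4py communicator, picklable Python objects derived from the loaded data should be collectable on one member with its `gather` method, while the `Ensemble` methods themselves are aimed at Functions.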

joernwichmann Feb 2, 2024
Author

I am interested in (b') load multiple CheckpointFiles, many per process. I'll have a look at ensemble parallelism. Thanks.
