I'm currently trying to understand mpi4py. I set mpi4py.rc.initialize = False
and mpi4py.rc.finalize = False
because I can't see why we would want initialization and finalization to happen automatically. The default behavior is that MPI.Init()
is called when the MPI module is imported. I suspect the reason is that each rank runs its own instance of the Python interpreter, and each of those instances executes the whole script, but that is just a guess. In the end, I prefer to have it explicit.
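To make sure I understand the mechanism, this is the explicit pattern I am aiming for. This is only a minimal sketch, assuming nothing beyond the documented mpi4py.rc options and MPI.Is_initialized()/MPI.Is_finalized():

import mpi4py
mpi4py.rc.initialize = False  # do not call MPI_Init at import time
mpi4py.rc.finalize = False    # do not call MPI_Finalize at interpreter exit
from mpi4py import MPI

assert not MPI.Is_initialized()  # nothing was initialized during import
MPI.Init()                       # explicit initialization
rank = MPI.COMM_WORLD.rank       # every rank runs this same script
MPI.Finalize()                   # explicit finalization
assert MPI.Is_finalized()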
Now this introduces some problems. I have the following code:
import numpy as np
import mpi4py
mpi4py.rc.initialize = False  # do not initialize MPI automatically
mpi4py.rc.finalize = False    # do not finalize MPI automatically
from mpi4py import MPI        # import the 'MPI' module
import h5py


class DataGenerator:
    def __init__(self, filename, N, comm):
        self.comm = comm
        self.file = h5py.File(filename, 'w', driver="mpio", comm=comm)

        # Create datasets
        self.data_ds = self.file.create_dataset("indices", (N, 1), dtype='i')

    def __del__(self):
        self.file.close()


if __name__ == '__main__':
    MPI.Init()

    world = MPI.COMM_WORLD
    world_rank = MPI.COMM_WORLD.rank

    filename = "test.hdf5"
    N = 10
    data_gen = DataGenerator(filename, N, comm=world)

    MPI.Finalize()
which results in
$ mpiexec -n 4 python test.py
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01559] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01560] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01557] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:

  Process name: [[15050,1],3]
  Exit code:    1
--------------------------------------------------------------------------
I am a bit confused as to what's going on here. If I move the MPI.Finalize()
to the end of the destructor, it works fine.
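For completeness, this is roughly the variant that runs cleanly for me. It is only a sketch of the ordering change, with the rest of the class unchanged, so that the parallel HDF5 file is closed before MPI_Finalize is reached:

    def __del__(self):
        self.file.close()
        MPI.Finalize()  # finalize only after the parallel file is closed

with the MPI.Finalize() call removed from the main block. I assume that explicitly dropping the object before finalizing in the main block would work as well, since that forces the destructor to run first:

    data_gen = DataGenerator(filename, N, comm=world)
    del data_gen     # runs __del__, which closes the file
    MPI.Finalize()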
Note that I also use h5py, which uses MPI for parallelization, so I have parallel file I/O here. Also note that h5py needs to be compiled with MPI support. You can do that easily by setting up a virtual environment and running pip install --no-binary=h5py h5py.
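To verify that the installed h5py build actually has MPI support, I check something like this; as far as I know, get_config().mpi reports whether h5py was built against MPI:

import h5py
print(h5py.get_config().mpi)  # True only if h5py was built with MPI support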