
I would like to create a program that works both with and without MPI at run time, and I have reached a point where I think that is not possible.

For example, an MPI program might look like this:

#include <mpi.h>

int main(int argc, char* argv[])
{
  MPI_Init(&argc, &argv);

...

  MPI_Finalize();
  return 0;
}

But in order to allow users who don't have MPI installed to compile this, I would have to wrap it in #ifs:

// Somewhere here, pull in a .h file which would set HAVE_MPI to 0 or 1

#if HAVE_MPI
#include <mpi.h>
#endif

int main(int argc, char* argv[])
{
#if HAVE_MPI
  MPI_Init(&argc, &argv);
#endif
...
#if HAVE_MPI
  MPI_Finalize();
#endif
  return 0;
}

Up to this point, I think I'm correct? If this program were compiled to a.out with HAVE_MPI set to 1, it would be run as mpirun -np 4 ./a.out or mpirun -np 1 ./a.out. It could never be run as plain ./a.out, because even when it isn't launched through mpirun, it would still call the MPI_* functions.

I guess what I would like -- a single executable that can be run both with and without mpirun -- is not possible, and the only option is to create two separate executables: one with HAVE_MPI set to 0 and another with it set to 1. Is this correct, or am I missing something about how C++ programs using MPI are implemented?

As a side-note, I don't think this is true with shared memory parallelization (i.e., with OpenMP). The same executable should work with a single or multiple threads.
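For example, I believe a minimal OpenMP program like this (just a sketch) is a single executable whose thread count is only chosen at run time:

#include <omp.h>
#include <cstdio>

int main()
{
  // Same binary either way; the thread count is picked at run time,
  // e.g. OMP_NUM_THREADS=4 ./a.out or OMP_NUM_THREADS=1 ./a.out
  #pragma omp parallel
  {
    std::printf("hello from thread %d of %d\n",
                omp_get_thread_num(), omp_get_num_threads());
  }
  return 0;
}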

Is my understanding correct?

Ray
  • To pick up on your side note - yes, the same executable should run on 1 or multiple threads. But don't be surprised to find that on 1 thread the executable is slower than the equivalent non-OpenMP code. Using OpenMP imposes some *parallel overhead*, part of which is imposed on all codes and does not vary with the number of threads used. – High Performance Mark Aug 19 '22 at 09:34
  • After reading the question I am not sure whether you know that `#ifdef` cannot help to have both options in one executable, or whether that is the question. `#ifdef` is evaluated by the preprocessor before actual compilation starts. – 463035818_is_not_an_ai Aug 19 '22 at 09:40
  • @HighPerformanceMark Ah! I see your point. So, it makes sense to also use a `#if HAVE_OPENMP` to do conditional compilation. Which means the same problem as my MPI problem -- I really should have two separate executables and can't have one that "does both". Thank you for the insight! – Ray Aug 19 '22 at 09:43
  • @463035818_is_not_a_number Yes, I'm aware `#ifdef` (and `#if`) are processed by the preprocessor. That's what I'm stuck on. I guess I'm correct, then? It is "impossible" to create an executable that works with and without `mpirun`, since it has to be determined at compile-time. There isn't a "trick" by experts that I'm missing out on? – Ray Aug 19 '22 at 09:44
  • Can you check the command line arguments and determine if the program was run via `mpirun`? You could only call MPI functions when the arguments are correct. For example: no arguments = don't use MPI. As for compilation, you could supply MPI headers with your project, so that everyone can compile it. The license seems to allow that. – VLL Aug 19 '22 at 09:50
  • Most MPI implementations allow a program to be run in singleton mode: simply run `a.out`, and `MPI_Init()` will succeed and treat this as a single-task job. To be perfectly clear, you will not need `mpirun a.out`, but you will at least need `libmpi.so` and its dependencies. – Gilles Gouaillardet Aug 19 '22 at 10:04
  • You can always write `int main() { if (MPI_enabled) { ...code using MPI...} else { ...code not using MPI...} }`, but I guess that's not what you want either ;) – 463035818_is_not_an_ai Aug 19 '22 at 10:33
  • @463035818_is_not_a_number Ah... True. I can do that as well! Thank you for mentioning it. And no, it isn't a matter of what I want or not want -- I think what I knew wasn't enough so any advice will help. So, thank you (and others) for the replies! – Ray Aug 19 '22 at 13:45
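A sketch combining the suggestions from the comments above: detect at run time whether the program was launched by mpirun, using environment variables that the launchers typically export (OMPI_COMM_WORLD_SIZE for Open MPI, PMI_SIZE for MPICH-based implementations; the exact names vary by implementation, so check yours). Note that this executable still links against libmpi, so per Gilles' comment the MPI libraries must be installed even for the serial path:

#include <mpi.h>
#include <cstdlib>
#include <cstdio>

// Heuristic: mpirun/mpiexec exports implementation-specific variables
// into the environment of each process it launches.
static bool launched_by_mpirun()
{
  return std::getenv("OMPI_COMM_WORLD_SIZE") != nullptr   // Open MPI
      || std::getenv("PMI_SIZE") != nullptr;              // MPICH-based
}

int main(int argc, char* argv[])
{
  if (launched_by_mpirun()) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    std::printf("MPI rank %d of %d\n", rank, size);
    // ... code path using MPI ...
    MPI_Finalize();
  } else {
    std::printf("not launched by mpirun, running serially\n");
    // ... code path without MPI ...
  }
  return 0;
}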

1 Answer


A compiled MPI program needs the MPI libraries at runtime, in addition to the mpirun call (though not all MPI implementations require mpirun for a single-process run). Thus, to call MPI functions only in some cases at runtime without making the executable itself depend on MPI, the portion of the code using MPI needs to be dynamically loaded. This can be done by putting the MPI-related functions in a shared library compiled separately and loading it at runtime when needed.
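Here is a minimal sketch of what such a component could look like (the names backend.h, mpi_backend.cpp and libmpi_backend.so are made up for illustration): the main executable only ever sees a plain interface, and all MPI symbols live in a shared library built separately with mpicxx.

// backend.h -- hypothetical interface exposed by the wrapper library;
// the main executable includes this and never includes <mpi.h> itself.
extern "C" {
  int  backend_init(int* argc, char*** argv);  // returns the rank
  int  backend_size();                         // number of processes
  void backend_finalize();
}

// mpi_backend.cpp -- compiled separately into the shared library,
// e.g.: mpicxx -shared -fPIC mpi_backend.cpp -o libmpi_backend.so
#include <mpi.h>

extern "C" int backend_init(int* argc, char*** argv)
{
  MPI_Init(argc, argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  return rank;
}

extern "C" int backend_size()
{
  int size;
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  return size;
}

extern "C" void backend_finalize()
{
  MPI_Finalize();
}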

The same is true for OpenMP: OpenMP code requires compiler flags like -fopenmp, which create a dependency on an OpenMP runtime. That being said, the common GOMP runtime (GCC) and the IOMP runtime (ICC/Clang) are installed by default on most Linux computing machines (especially the former, since GCC is generally the default compiler and is bundled with GOMP).

Using shared libraries also lets you easily switch between two parallel implementations, e.g. MPI and OpenMP. It means, however, that the initialization, finalization and communication collectives must be wrapped in an external library compiled separately from your program. The library functions can be loaded dynamically at startup or manually at runtime using dlopen (see this related post).
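Continuing the sketch above, the main program can then try to load the library with dlopen and fall back to a serial path when it is absent (the file name libmpi_backend.so is again hypothetical; link the main program with -ldl, and note that error checking on dlsym is omitted for brevity):

// main.cpp -- no MPI headers or libraries are needed to build this.
#include <dlfcn.h>
#include <cstdio>

int main(int argc, char* argv[])
{
  void* lib = dlopen("./libmpi_backend.so", RTLD_NOW);
  if (lib) {
    // Resolve the wrapper functions by name.
    auto init     = reinterpret_cast<int (*)(int*, char***)>(dlsym(lib, "backend_init"));
    auto size     = reinterpret_cast<int (*)()>(dlsym(lib, "backend_size"));
    auto finalize = reinterpret_cast<void (*)()>(dlsym(lib, "backend_finalize"));

    int rank = init(&argc, &argv);
    std::printf("MPI rank %d of %d\n", rank, size());
    // ... code path using MPI ...
    finalize();
    dlclose(lib);
  } else {
    // The library (and thus MPI) is not available: run serially.
    std::printf("running without MPI\n");
    // ... serial code path ...
  }
  return 0;
}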

As pointed out in the comments, preprocessor directives are evaluated at compile time, which is why they force you to build your program twice. This is not required with shared libraries (though every diverging code path that may be executed at runtime still needs to be compiled separately).

Jérôme Richard
  • Ah! Thank you for this! I only had a vague idea about **dynamically loaded** libraries, but I certainly would not have figured out that it was the solution to my problem! Thank you for posting this answer and mentioning `dlopen`! I think I've seen programs do what I had asked; just no idea how they were written. Thank you!! – Ray Aug 19 '22 at 13:42