I am wondering whether there is a way to measure the time spent in MPI calls at runtime, so that I can use it to compute a new load balance.

I know how to profile and trace the program using tools from Open MPI or Intel, but those only report their results after the run has finished. I have also tried FPMPI, without success, because the latest release fails to build.

Measuring "by hand" does not make sense in my application because it is way too big :/

1 Answer

First of all, do you really need to profile low-level communication such as MPI? Can't you simply time your high-level routines instead?

Anyway, it is pretty easy to write your own MPI profiler. Practically all MPI libraries (Open MPI included) export their functions (e.g., MPI_Send) as weak aliases of the same function symbols with prefix P (e.g., PMPI_Send). All you need to do is define your own functions with the same prototypes as the ones in the MPI library. Inside, update your call counters, start the timers, then call the original MPI function with a P prefix, and upon return stop the timers:

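// calls and timers are containers you define yourself (e.g., maps keyed by function name)
extern "C"
int MPI_Send(const void *buf, int count, MPI_Datatype dtype, int dest, int tag, MPI_Comm comm) {
   // The prototype must match the one in mpi.h exactly (buf is const since MPI-3.0)

   // Update call counters and start the timer
   calls["MPI_Send"]++;
   timers["MPI_Send"].start();

   // Call the original MPI function
   int result = PMPI_Send(buf, count, dtype, dest, tag, comm);

   // Stop the timer
   timers["MPI_Send"].stop();

   return result;
}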

The extern "C" part is important, otherwise you won't override the correct weak symbol if you write in C++.

This ability to override symbols from the MPI library is standardised - see §14.2 Profiling Interface in the current version of the standard.
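The wrapper above uses `calls` and `timers` without defining them. Here is a minimal sketch of what they could look like, assuming a small hypothetical `Timer` class built on `std::chrono::steady_clock` and two global maps keyed by function name. None of these names come from MPI itself; they are only placeholders for whatever bookkeeping you prefer.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper (not part of MPI): accumulates wall-clock time
// over repeated start()/stop() pairs.
class Timer {
    using Clock = std::chrono::steady_clock;

public:
    void start() {
        begin_ = Clock::now();
        running_ = true;
    }

    void stop() {
        if (!running_) return;  // ignore a stop() without a matching start()
        total_ += Clock::now() - begin_;
        running_ = false;
    }

    // Accumulated time in seconds over all completed intervals
    double seconds() const {
        return std::chrono::duration<double>(total_).count();
    }

    // Clear the accumulated time, e.g. after a load-balancing step
    void reset() {
        total_ = Clock::duration::zero();
        running_ = false;
    }

private:
    Clock::time_point begin_{};
    Clock::duration total_{Clock::duration::zero()};
    bool running_ = false;
};

// Assumed global registries, keyed by MPI function name
std::map<std::string, long> calls;
std::map<std::string, Timer> timers;
```

A load balancing routine could then read `timers["MPI_Send"].seconds()` and `calls["MPI_Send"]` at whatever interval suits it, and call `reset()` afterwards to start a fresh measurement window. Note that this sketch is not thread-safe, so it would need locking or per-thread storage if MPI calls are made from several threads.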

Hristo Iliev
  • Tools such as mpiP (http://software.llnl.gov/mpiP/) can already do that for you. That being said, I was under the impression the question was how to get this information in real time. – Gilles Gouaillardet May 09 '20 at 10:01
  • Exactly why I'm showing how to write one's own MPI profiler. A load balancing module may periodically consult the timers and call counters and make the appropriate adjustments. But then again, I think MPI profiling is a bit too low of a level to instrument for such purposes. – Hristo Iliev May 09 '20 at 10:02
  • SOSflow might help https://github.com/cdwdirect/sos_flow/wiki/TAU-Integration-Example-with-MPI,-ADIOS-support – Gilles Gouaillardet May 09 '20 at 10:07
  • Oh, nice! I didn't know about SOSflow. But it seems a bit of an overkill when the goal is simply load balancing. – Hristo Iliev May 09 '20 at 10:11
  • Yes, I need to. I had already found the same solution you sent me :D But thanks anyway, this is a good way to do it! – Jeremy Harisch May 11 '20 at 12:26