
How do I check that a program is using MPI when it runs? Specifically, how can I verify the program is running on multiple processors? Also, how can I figure out if my program is correctly running across multiple nodes?

Rob Latham
Coheen
  • Are you asking about the actual physical cores it is executed on? Within a single node, or across multiple nodes? Distinct nodes could tell you their hostnames; within a node it probably depends on your operating system. What do you mean by "applying MPI run correctly"? You use `mpirun` or `mpiexec` to start up multiple processes, each executing your program. – haraldkl Sep 25 '15 at 00:26
  • Yes, I am asking about the actual physical cores, across multiple nodes. I use mpirun. – Coheen Sep 25 '15 at 19:51
  • So what do you mean by "applying MPI run correctly"? Whether it spawns across the network? See the answer from Pooja Nilangekar for that. Or what are you looking for? It's not quite obvious. – haraldkl Sep 25 '15 at 19:54

2 Answers


I am assuming you're trying to figure out which processor/host each MPI process is running on.

You can use the MPI_Get_processor_name function to print the processor name.

Here is what your code will look like.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, name_len;
    char processorname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    /* Rank of this process within MPI_COMM_WORLD. */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Fills in the name (usually the host name) and its actual length. */
    MPI_Get_processor_name(processorname, &name_len);
    printf("Hello world!  I am process number: %d on processor %s\n", rank, processorname);
    MPI_Finalize();
    return 0;
}

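To compile the program, use mpicc -o hello_world hello_world.c. To run it, use mpirun -np 4 -f machinefile ./hello_world. This starts 4 processes on the hosts listed in your machinefile; if the output shows more than one host name, the program is running across multiple nodes. Note that the -f option is the MPICH (Hydra) spelling; Open MPI uses --hostfile machinefile instead.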

Pooja Nilangekar
  • Actually, this is not what I am asking. I am asking: if I run a distributed program with mpirun across multiple processors, how can I test, while the program is running, whether it is actually running in a distributed way correctly? – Coheen Sep 25 '15 at 19:54
  • It is impossible to prove full correctness of most non-trivial programs. You can only test certain functions of your program and see if it behaves as you expect. – Vladimir F Героям слава Sep 25 '15 at 23:07
  • Okay, so on a Linux system you can use [htop](http://hisham.hm/htop/) to keep track of the running processes, the status of the threads (Running, Sleeping, Zombie, etc.), the CPU utilization, memory utilization, etc. So you could ssh into each of the machines in the cluster and run htop to check whether the process is running. As far as correctness is concerned, I don't understand what exactly you wish to check. You could use a logger and check the log to figure out whether or not the intermediate values are what you expect. – Pooja Nilangekar Sep 27 '15 at 07:23

You didn't tell us what you are actually looking for. Your question is unclear and ambiguous; it would be great if you could improve it. That being said, I guess you would like to know whether your processes are actually executed by distinct CPU cores.

First of all, Pooja Nilangekar explained a method to verify the distribution across a network. Within a single node, it most likely depends on the system you are running on. On Linux, you could for example make use of the /proc filesystem and check the status of the current process in /proc/self/. This pseudo filesystem offers a file stat, which contains a field processor (field 39 according to the proc(5) man page) showing the ID of the CPU this process was last run on. Also check /proc/self/status for the CPUs the process is allowed to run on (the Cpus_allowed and Cpus_allowed_list entries). MPI or your scheduler might restrict this set for each process. Together with the node information from Pooja Nilangekar's answer, you can thereby find out where each process is running.

If you cannot modify the sources to have each process report where it is running, I think the easiest way to see which cores are utilized would be top. Maybe also have a look at this blog post on "How do I find out Linux CPU utilization?", which also mentions mpstat and sar.

haraldkl