
I am trying to build an MPI cluster by following a tutorial, using Ubuntu 14.04 and a BeagleBoard-xM board. The problem is that my client is the BeagleBoard-xM, which has a 32-bit ARMv7 processor. I built an executable, mpi_hello_world, with mpic++ from hello_world.c, the contents of which are:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d"
           " out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
} 

I can compile this on both Ubuntu 14.04 (Intel x86_64) and the BeagleBoard-xM. However, when I try to run it in parallel, for example with "mpirun -host Server,board1 ./mpi_hello_world", I get

bash: orted: command not found
--------------------------------------------------------------------------
A daemon (pid 8349) died unexpectedly with status 127 while attempting
to launch so we are aborting.

I believe this is because the 32-bit executable cannot be launched from my server. If I run "./mpi_hello_world" on the board itself, I get "-su: ./mpi_hello_world: cannot execute binary file: Exec format error". The reverse happens if I compile on the board and try to run it on the server. So my question is: how can I have a single executable that runs on both my server and the board at the same time?

srai
  • Did you try to compile 2 executables: one for the x86 and one for the ARM? Then you could try running with something like this: `mpirun -host Server ./mpi_hello_world_x86 : -host board1 ./mpi_hello_world_arm`... Just an idea, not sure it would work. – Gilles Apr 12 '16 at 07:13
  • @Gilles: I tried running "mpirun -np 1 -host Server ./mpi_hello_world_x86 : -np 1 -host board1 ./mpi_hello_world_xarm" but I still get "bash: orted: command not found" followed by "A daemon (pid 8349) died unexpectedly with status 127 while attempting to launch so we are aborting." The executables run fine on their own machines individually. – srai Apr 12 '16 at 07:33
  • So you have (had) two different issues: 1/ the incompatibility between binaries (which my previous comment tried to address) and 2/ the configuration of your MPI environment. You have to fix the latter before trying to run any sort of MPI code. Until something like `mpirun -host Server,board1 hostname` returns the names of both machines, you do not have a working MPI environment. – Gilles Apr 12 '16 at 07:42
  • @Gilles: Running `mpirun -host Server,board1 hostname` gives the same orted error. I believe my environment is not set up properly to begin with. Any suggestions as to what I might have missed? Thanks – srai Apr 12 '16 at 07:52

1 Answer


Open MPI is complaining that it cannot find orted in the path on the remote host. You should modify the PATH variable in the shell profile script on the Beagle board to contain the path to the bin directory and the LD_LIBRARY_PATH variable to contain the path to the lib directory of the Open MPI installation. Or give the path to the remote installation with the --prefix option:

mpiexec --prefix /path/to/openmpi/on/beagle ...

The --prefix option sets both PATH and LD_LIBRARY_PATH on the remote system. Note that the last component of the library path is copied from the local installation, e.g. if Open MPI has its libraries in /path/to/openmpi/lib64 (since your host is 64-bit), then it will set LD_LIBRARY_PATH on the remote host to /path/to/openmpi/on/beagle/lib64, which will probably not be correct. This is usually of concern only when Open MPI is installed in the default system library locations, e.g. when installed from a package, and only on some Linux distributions, notably those based on RedHat. Ubuntu follows the Debian convention and puts 64-bit libraries in /usr/lib, so you should still be able to use the --prefix mechanism safely.
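The profile-based alternative mentioned earlier can be sketched as follows. This is only an illustration: the prefix /opt/openmpi and the helper name prepend_path are assumptions, not something from the question, and the lib directory name should match the actual installation on the board.

```shell
# Hypothetical install prefix on the Beagle board; replace with the real one.
OMPI_PREFIX="${OMPI_PREFIX:-/opt/openmpi}"

# Prepend a directory to a colon-separated list unless it is already there.
# (prepend_path is an illustrative helper, not part of Open MPI.)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

PATH="$(prepend_path "$OMPI_PREFIX/bin" "$PATH")"
LD_LIBRARY_PATH="$(prepend_path "$OMPI_PREFIX/lib" "${LD_LIBRARY_PATH:-}")"
export PATH LD_LIBRARY_PATH
```

The remote daemon is normally started through a non-interactive ssh session, so these lines have to live in a file that such a shell actually reads. With the stock Ubuntu ~/.bashrc that likely means placing them above the early return for non-interactive shells.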

Since you want to run the program on hosts with different CPU types, you must rebuild Open MPI on both systems with --enable-heterogeneous in order to enable support for heterogeneous computing. Make sure that you are not building with --enable-orterun-prefix-by-default, as it requires Open MPI to be installed in exactly the same location on both the host and the Beagle board.
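One way to confirm that a rebuild picked up the flag is to inspect the `ompi_info` report on each machine. The helper below is a rough sketch: the function name has_hetero_support is made up for illustration, and it assumes the report contains a line of the form "Heterogeneous support: yes", which may differ between Open MPI versions.

```shell
# Rebuild step, run by hand in the Open MPI source tree on each machine
# (the prefix below is a hypothetical example):
#   ./configure --prefix="$HOME/openmpi" --enable-heterogeneous
#   make && make install

# Reads `ompi_info` output on stdin and succeeds if heterogeneous support
# is reported as enabled. The exact line format is an assumption.
has_hetero_support() {
    grep -Eq 'Heterogeneous support:[[:space:]]*yes'
}

# Usage on each machine:
#   ompi_info | has_hetero_support && echo "heterogeneous build" || echo "rebuild needed"
```

Both the server and the board need to pass this check, since the flag has to be enabled on every node taking part in the job.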

You do not need to give the executables different names on both platforms if they don't share a common filesystem, but doing so helps prevent confusion.

To summarise:

mpiexec --prefix /path/to/openmpi/on/beagle \
        -H localhost -n 1 ./binary_x64 : \
        -H beaglexm -n 1 ./binary_arm

This still assumes that the current directory on the host, e.g. /home/user/mpitest, also exists on the Beagle board. If not, provide the full path to the ARM executable in the second application context.
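If a single command name on both machines is preferred over two application contexts, a small wrapper script can select the native binary at launch time. The sketch below is an assumption-laden illustration: pick_binary and the script name run_hello.sh are hypothetical, the binary names are taken from the command above, and the architecture strings are what `uname -m` typically prints on x86_64 and ARMv7 Linux.

```shell
#!/bin/sh
# Map a machine type (as printed by `uname -m`) to the matching binary.
# pick_binary is an illustrative helper; the mapping itself is an assumption.
pick_binary() {
    case "$1" in
        x86_64)  echo "./binary_x64" ;;
        arm*)    echo "./binary_arm" ;;
        *)       echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# In a wrapper script (hypothetically named run_hello.sh), the last line
# would replace the shell with the native binary:
#   exec "$(pick_binary "$(uname -m)")" "$@"
```

With a copy of the wrapper and the matching binary in the same directory on each machine, something like `mpiexec -H Server,board1 ./run_hello.sh` might then replace the two-context command. It does not remove the need for a heterogeneous Open MPI build or a working remote launch.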

Note: Heterogeneous support in Open MPI is generally broken. Only very simple codes are likely to work.

Hristo Iliev
  • @Hristo Iliev: Thanks for the detailed explanation. I am currently using MPICH instead of Open MPI. Do you recommend that I uninstall it, install Open MPI instead and follow your guide, or should I try to set PATH and LD_LIBRARY_PATH? For MPICH2 I have created a separate user, mpiuser, on both machines. – srai Apr 12 '16 at 08:17
  • @srai, you are obviously not using `mpirun` from MPICH since `orted` is part of Open MPI's ORTE. It probably means that you have both Open MPI and MPICH installed on the host, which often leads to problems. Uninstall Open MPI if you are not using it. – Hristo Iliev Apr 12 '16 at 08:21
  • @Hristo Iliev: I can now run a two-node MPI cluster with Ubuntu 14.04 (x86_64), though the ARM executable fails. Do you recommend that I separate my application into two parts, one for ARM and one for x86_64? The board needs to talk to the server. – srai Apr 12 '16 at 09:12
  • It's entirely up to you. If the client and server (so to say) parts of the application are sufficiently different, it might make sense to split it into two different projects. – Hristo Iliev Apr 12 '16 at 12:12