When I run Valgrind to detect errors in an MPI application, I get the following error:

libmpi.so.0: cannot open shared object file: No such file or directory

I found the following: the Valgrind documentation (section 4.9.1) states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."

Since I'm using MPICH2, it should actually use libmpich.so.1.0.

This can be seen in libmpiwrap.c:

#include "mpi.h"

/* Where are API symbols?
Open MPI      lib/libmpi.so,   soname = libmpi.so.0
Quadrics MPI  lib/libmpi.so,   soname = libmpi.so.0
MPICH         libmpich.so.1.0, soname = libmpich.so.1.0

A suitable soname to match with is therefore "libmpi*.so*".
*/

My question is: where and how do I configure this?

Wesley Bland
mort
  • How are you configuring/installing Valgrind? How about MPICH2? IIRC it just works if you specify `--with-mpicc=/path/to/mpicc` correctly to Valgrind's `configure`. Also ensure that your MPICH2 installation is configured with `--enable-shared`. – Dave Goodell May 06 '12 at 01:36
  • According to the valgrind documentation, the mpi installation is detected automatically. Where do I see which mpi installation valgrind uses? configure just tells me that mpicc has been found. – mort May 07 '12 at 09:38
  • Then just make sure that the correct MPI installation has been detected. The `--with-mpicc=` option just helps Valgrind find the right MPI installation. – Dave Goodell May 07 '12 at 13:40
  • I reinstalled valgrind and mpich with said options...still the same error! – mort May 07 '12 at 16:25
  • Desperate as I was, I simply renamed libmpich.so to libmpi.so, and the error went away. Nevertheless, Valgrind doesn't print anything similar to "valgrind MPI wrappers 31901: Active for pid 31901 valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options" as stated in the documentation. – mort May 07 '12 at 19:12

1 Answer

I have had this issue myself. Hopefully this can point you in the right direction. Unfortunately, MPI is notoriously difficult to get working with valgrind and gdb.

Option 1: Find the right package.

On Fedora, RHEL, and other RPM-based systems, the Valgrind MPI wrapper library built for Open MPI ships in the valgrind-openmpi package. I could not find an MPICH version. I use MVAPICH, so this didn't help me. If you find the MPICH/MVAPICH versions, please comment so I can add them here.

PBone lists the libmpiwrap shared object among the package contents.

Option 2: Build libraries from source. (What I did for MVAPICH)

I resorted to compiling Valgrind from source and then copying the resulting shared libraries. I wanted to keep the package manager's version of Valgrind, so I downloaded the source for that same version and built it with the system's default GCC to be safe. A newer release would probably work too, but I doubt you would gain much.

When building Valgrind, you need to make sure it finds the proper MPI installation. Check the output to verify the correct shared libraries and header files. I have a habit of installing multiple MPIs on a system, so this is something I learned the hard way. Spend some time on the autoconf/automake output. Luckily, the Valgrind maintainers do a good job of keeping the build pretty simple, so I didn't run into any major problems compiling.
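As a rough sketch of that check (not the exact commands I ran): the `--with-mpicc` option comes from the comments above, the glob is the one quoted from libmpiwrap.c in the question, and `MPICC`, `PREFIX`, and the function names are hypothetical placeholders. Since that glob already matches libmpich.so.1.0, a loader error about libmpi.so.0 would suggest the wrapper was linked against a different MPI than the one you run with.

```shell
#!/bin/sh
# Sketch only: MPICC and PREFIX are hypothetical paths, adjust them to your system.
MPICC="${MPICC:-/opt/mpich2/bin/mpicc}"
PREFIX="${PREFIX:-$HOME/valgrind-mpich}"

# Configure, build and install Valgrind; run this from the Valgrind source directory.
build_valgrind() {
    ./configure --prefix="$PREFIX" --with-mpicc="$MPICC" && make && make install
}

# Succeeds if a soname matches the glob quoted from libmpiwrap.c ("libmpi*.so*").
matches_wrapper_pattern() {
    case "$1" in
        libmpi*.so*) return 0 ;;
        *)           return 1 ;;
    esac
}

# List the MPI libraries that a built wrapper library was linked against.
wrapper_mpi_deps() {
    ldd "$1" | grep 'libmpi'
}
```

After `build_valgrind`, calling `wrapper_mpi_deps` on the installed libmpiwrap shared object should list your MPICH library rather than libmpi.so.0. `matches_wrapper_pattern libmpich.so.1.0` succeeds, which is why the soname pattern itself should not need changing.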

Source code: http://valgrind.org/downloads/

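Once you have the wrapper library, the key is to set the environment variables described in the Valgrind documentation: `LD_PRELOAD` pointing at the libmpiwrap shared object, and optionally `MPIWRAP_DEBUG`.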
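A minimal sketch of such a run, assuming the wrapper was installed under `lib/valgrind` in the install prefix (the directory can differ between Valgrind versions). The function names, the rank count, and the example prefix and platform tag are hypothetical.

```shell
#!/bin/sh
# Sketch only: prefix, platform tag, rank count and program name are hypothetical.

# Print the path of the MPI wrapper library under a Valgrind install prefix.
# $1 = install prefix, $2 = platform tag such as amd64-linux
wrapper_path() {
    printf '%s/lib/valgrind/libmpiwrap-%s.so\n' "$1" "$2"
}

# Run an MPI program under Valgrind with the wrapper library preloaded.
# $1 = install prefix, $2 = platform tag, remaining arguments = program and its arguments
run_with_wrappers() {
    prefix=$1
    platform=$2
    shift 2
    LD_PRELOAD="$(wrapper_path "$prefix" "$platform")" \
        mpirun -np 2 "$prefix/bin/valgrind" "$@"
}
```

For example, `run_with_wrappers "$HOME/valgrind-mpich" amd64-linux ./my_mpi_app`. If the wrappers are active, each process should print the "valgrind MPI wrappers ...: Active for pid ..." banner quoted in the comments above; setting `MPIWRAP_DEBUG=help` in the environment lists the available options.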

Additional Notes
- https://wiki.mpich.org/mpich/index.php/Support_for_Debugging_Memory_Allocation
- https://fs.hlrs.de/projects/marmot/publications/paralleldebugging-ppt.pdf

msmith81886