
Hi, I'm trying to open MPI shared libraries (the .so files of OpenMPI and IntelMPI) to list all of those present on my system and get their versions.

So I wrote some code to list the versions of OpenMPI and IntelMPI present on my system. The problem is that for IntelMPI 2018, calling MPI_Init causes a segfault, most likely because it is the C++ version of the library.

On versions that segfault in MPI_Init, I can usually call MPI_Get_version without getting a segfault. On versions that don't segfault, calling MPI_Get_version gives me an error that stops the execution of my program.

Is there a way to call a function obtained through dlsym and catch the segfault raised by MPI_Init, or to prevent MPI_Get_version from stopping the process?

Here is a snippet of my code that loads the library.

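    // RTLD_DEEPBIND (glibc extension) makes the library prefer its own symbols over global ones.
    this->opened_dll = dlopen(path.string().c_str(), RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
    if (this->opened_dll == nullptr) {
        throw std::runtime_error(dlerror());  // report why the load failed; needs <stdexcept>
    }

    // MPI_Get_library_version, MPI_Get_version and MPI_Init are used here as function-pointer typedefs.
    this->dll_MPI_Get_library_version = reinterpret_cast<MPI_Get_library_version>(dlsym(this->opened_dll, "MPI_Get_library_version"));
    this->dll_MPI_Get_version = reinterpret_cast<MPI_Get_version>(dlsym(this->opened_dll, "MPI_Get_version"));
    auto dll_MPI_Init = reinterpret_cast<MPI_Init>(dlsym(this->opened_dll, "MPI_Init"));
    if (dll_MPI_Init != nullptr) {
        dll_MPI_Init(nullptr, nullptr);  // this is the call that segfaults with IntelMPI 2018
    }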

When MPI_Init causes a segfault, its handle isn't null. I should check all return values, but for now I just wanted to see whether this is possible at all. My debugger tells me none of these values are nullptr, so that is not the cause.

Wrapping the call in try/catch does nothing, as expected, since it's C code and exceptions don't exist in C.

I might try forking my process so that only the fork crashes, but that will force me to set up communication to get the valid values back.
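
The forking idea can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not something tested against any MPI library: `run_isolated` and its `probe` callback are hypothetical names, and the assumption is that the `dlopen`/`dlsym` calls and the MPI call itself would be made inside `probe`, so that they only ever run in the child. The child sends its result back through a pipe; if it dies from a signal such as SIGSEGV, the parent simply sees a failed exit status.

```cpp
#include <functional>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `probe` in a forked child process.
// Returns true and fills `out` if the child exited cleanly,
// false if it crashed (e.g. SIGSEGV), threw, or could not be started.
bool run_isolated(const std::function<std::string()>& probe, std::string& out) {
    int fds[2];
    if (pipe(fds) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }

    if (pid == 0) {
        // Child: a crash in here only kills this process.
        close(fds[0]);
        std::string result;
        try {
            result = probe();
        } catch (...) {
            _exit(1);
        }
        const char* p = result.data();
        size_t left = result.size();
        while (left > 0) {
            ssize_t n = write(fds[1], p, left);
            if (n <= 0) _exit(2);
            p += n;
            left -= static_cast<size_t>(n);
        }
        _exit(0);  // _exit skips atexit handlers and static destructors copied from the parent
    }

    // Parent: read until the child closes its end of the pipe (normal exit or crash).
    close(fds[1]);
    out.clear();
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<size_t>(n));
    }
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In the parent, a `false` return would mean the probe for that library crashed or failed, so a loop over candidate .so paths could just move on to the next one. The pipe is the only communication needed as long as the result fits in a string.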

  • Welcome to Stackoverflow! As far as I remember choosing MPI libraries at runtime is not supported by the MPI standard. You choose your MPI implementation at compile time. – Dima Chubarov Jul 31 '23 at 14:04
  • segfault means "undefined behavior". Memory corruption. You're toast. Your goose is cooked. Your process is no longer stable, and you have no longer any guarantees whatsoever. All warranties are void. All exits are sealed. All animals at the zoos have been locked down, etc, etc, etc... Good night. – Sam Varshavchik Jul 31 '23 at 14:06
  • @SamVarshavchik "segfault means "undefined behavior". Memory corruption. You're toast." - Not technically correct. It simply means you tried to access memory outside your allocated address space. That does not imply undefined behaviour. It can be a valid thing for a program to do and you *can* install a signal handler to catch it - the Java virtual machine does this to figure out when it needs to allocate more memory for the program it is running. I agree that in most cases it means that you messed up, but not in *all* cases. – Jesper Juhl Jul 31 '23 at 14:11
  • I need to correct my earlier comment. OpenMPI supports loading the library at runtime, but there are caveats. https://www.open-mpi.org/faq/?category=running#loading-libmpi-dynamically – Dima Chubarov Jul 31 '23 at 14:14
  • _i might try forking my process so i only crash the fork_ That sounds like a good idea to me. If you apply yourself to the problem I'm sure you'll figure something out. – Paul Sanders Jul 31 '23 at 14:24
  • " i might try forking my process so i only crash the fork That sounds like a good idea to me. If you apply yourself to the problem I'm sure you'll figure something out." i was thinking this isn't good code so i wanted to know if there was a better way. – Marc BARBIER Jul 31 '23 at 15:04
  • A C++ module can only be called if the caller was compiled with the same version of the same compiler, using the same compiler options. And even then, expect problems. – Lorinczy Zsigmond Jul 31 '23 at 15:13
  • Can you think of a case where it segfaults without incurring in undefined behavior? Asking for a friend. – Something Something Jul 31 '23 at 20:15
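
The signal-handler route mentioned in the comments could look roughly like the sketch below. It is a sketch under assumptions, not a recommendation: `call_guarded`, `on_segv` and `g_recover_point` are hypothetical names, it is not thread-safe, and after a real fault inside a library such as MPI the process may be left with held locks or half-initialized state, so a separate process stays the more robust option. It also jumps over any C++ destructors between the fault and the recovery point, so the guarded function should be plain C-style code.

```cpp
#include <setjmp.h>
#include <signal.h>

// Hypothetical names; nothing below comes from the question.
static sigjmp_buf g_recover_point;

extern "C" void on_segv(int) {
    // Jump back to the sigsetjmp call; the value 1 marks "a fault happened".
    siglongjmp(g_recover_point, 1);
}

// Calls fn() and returns true if it returned normally,
// false if SIGSEGV was delivered while it was running.
bool call_guarded(void (*fn)()) {
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) return false;

    bool ok;
    if (sigsetjmp(g_recover_point, 1) == 0) {  // 1: also save the signal mask
        fn();
        ok = true;
    } else {
        ok = false;  // reached through siglongjmp from the handler
    }

    sigaction(SIGSEGV, &old, nullptr);  // restore the previous handler
    return ok;
}
```

A call such as `call_guarded(some_plain_function)` would then report a fault as `false` instead of terminating the process, with all of the caveats above about what state the process is in afterwards.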
