
I'm trying to serialize a std::vector that is a member of a class. The attached code throws a segmentation fault that I haven't been able to resolve for weeks now. It's strange, because it works fine with smaller container sizes, and even this size should be far too small to cause memory issues.

#include <iostream>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/mpi.hpp>
#include <boost/version.hpp>
class Dummy
{
private:
    std::vector<unsigned int> array;
    friend class boost::serialization::access;
    template <class T> 
    inline void serialize(T & ar, const unsigned int /*version*/)
    {ar & array;}
public:
    Dummy()
    {};
    Dummy(const int nDivs) 
    {array.resize(nDivs);}
    ~Dummy() {}; 
};
int main(int argc, char* const argv[]) 
{
    boost::mpi::environment   env; 

    boost::mpi::communicator  world;

    boost::mpi::request req; 

    std::cout << "Using Boost "     
          << BOOST_VERSION / 100000     << "."  // major version
          << BOOST_VERSION / 100 % 1000 << "."  // minor version
          << BOOST_VERSION % 100                // patch level
          << std::endl;

    int MASTER = 0; 
    int tag    = 0;
    std::size_t size = 10000;  
    std::vector<unsigned int> neighbrs = {0, 1, 2, 3};
    std::vector<Dummy> TobeSend (size, Dummy(2)); 
    std::vector<Dummy> TobeRecvd (size ,Dummy(1));

    for(auto itri:neighbrs)
    {
        int target = itri; 
        if(world.rank()!= target)
        {

            world.isend(target, tag, TobeSend);
            
        }
    }
    for(auto isource:neighbrs)
    {
        int source = isource;
        if(world.rank()!= source)
        {
            req= world.irecv(source, tag, TobeRecvd); 
            req.test(); 
        }
    }
    return 0; 
}

I'm building the code with:

    mpic++ -g mpidatatype.cpp -o output -lboost_mpi -lboost_serialization

I've tried both versions 1.75 and 1.82. I'd appreciate help with this problem.
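For reference, an example launch, assuming 4 ranks to match the neighbrs list, would be:

    # assumed launch: one rank per entry in neighbrs
    mpirun -np 4 ./output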

Here's the call stack at the start of the program, before sending occurs (screenshot of the debugger call stack):

  • If you are getting segmentation fault then it would be a good idea to show a call stack when it happens. Also you should specify compiler version and the way you run this executable. – user7860670 Jul 13 '23 at 07:01
  • Yeah. The code looks fine. It runs fine for me. Since you have different library versions check that you are actually finding the right versions at runtime. We need more information to be able to help with that. Versions, platform etc. – sehe Jul 13 '23 at 11:58
  • @sehe Thank you for the comment. I'm on Ubuntu 22.04.2, and I've installed the latest version of boost-mpi on my machine, which is 1.82; the program also prints 1.82 as the Boost version, so I think it's finding the right one. What other information is needed? I've tried this code in many variations, changing the container to a boost multi_array, a 2D vector, or a static array, and they all failed differently. Sometimes I get std::bad_alloc, sometimes std::length_error, sometimes a segmentation fault; it's getting really frustrating. – MA19 Jul 13 '23 at 17:17
  • I've tested on my personal machine (Ubuntu 20.04, Boost 1.71) and got the same error. I can't understand this :( – MA19 Jul 13 '23 at 17:39
  • "I've installed the latest version of boost-mpi" - how? Because the distro latest version is 1.74 https://packages.ubuntu.com/jammy/libs/libboost-mpi1.74.0 - In general just expand the question with exact steps. All the symptoms you are reporting can be summarized to "Undefined Behaviour". This happens when you violate ODR or have ABI mismatch. You just have to trace down to the source and you can breathe again – sehe Jul 13 '23 at 18:33
  • @sehe I downloaded it from here https://www.boost.org/users/download/ and then followed the instructions at https://kratos-wiki.cimne.upc.edu/index.php/How_to_compile_the_Boost_if_you_want_to_use_MPI to build with MPI support. I've debugged, but everything happens behind the scenes, and it's not easy for me to see why and how it's failing – MA19 Jul 14 '23 at 06:11
  • Using GDB, I know that at least one process fails to receive the message, and then this weird error is thrown right at the end of the run – MA19 Jul 14 '23 at 06:12
  • You could show us the backtrace. Even though "everything comes from behind the scenes" someone else might be able to make something of it – sehe Jul 14 '23 at 11:58
  • @sehe I added call stack, is this what you mean? – MA19 Jul 14 '23 at 15:48
  • Yes. It would be more informative if not truncated (because of the image). But at least now we know it's an allocation triggering the segv. Maybe you can mark the source line in main corresponding to that stack trace – sehe Jul 14 '23 at 15:55
  • @sehe Finally, I found the answer: I added `namespace boost { namespace mpi { template<> struct is_mpi_datatype<Dummy> : public mpl::true_ { }; } }` and the problem went away, but I had thought that this piece of code was optional, not necessary. – MA19 Jul 16 '23 at 17:35

1 Answer


So, I finally resolved this. The problem was the missing specialization

    namespace boost { namespace mpi {
        template<> struct is_mpi_datatype<Dummy>
            : public mpl::true_ { };
    } }

Adding this specialization resolved my issue.
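For context, here is a minimal sketch of where the specialization sits relative to the class from the question (same class and member names; the trait lives at namespace scope, outside the class):

    #include <vector>
    #include <boost/mpi.hpp>
    #include <boost/serialization/vector.hpp>

    class Dummy
    {
    private:
        std::vector<unsigned int> array;
        friend class boost::serialization::access;
        template <class T>
        void serialize(T& ar, const unsigned int /*version*/) { ar & array; }
    public:
        Dummy() {}
        Dummy(const int nDivs) { array.resize(nDivs); }
    };

    // Specialization from the answer above: tells Boost.MPI to treat Dummy
    // as an MPI datatype. It goes at namespace scope, after the class
    // declaration and before any Dummy objects are sent or received.
    namespace boost { namespace mpi {
        template<> struct is_mpi_datatype<Dummy> : public mpl::true_ { };
    } }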
