I want to transfer a large amount of std::complex<double> numbers using Boost.MPI. In the Boost.MPI tutorial it is explained that

To obtain optimal performance for small fixed-length data types not containing any pointers it is very important to mark them using the type traits of Boost.MPI and Boost.Serialization.

It was already discussed that fixed-length types containing no pointers can be marked as MPI datatypes by specializing is_mpi_datatype, e.g.:

namespace boost { namespace mpi {
  template <>
  struct is_mpi_datatype<gps_position> : mpl::true_ { };
} }

or the equivalent macro

BOOST_IS_MPI_DATATYPE(gps_position)

The documentation of is_mpi_datatype gives another example:

[...] To do so, first make the data type Serializable (using the Boost.Serialization library); then, specialize the is_mpi_datatype trait for the point type so that it will derive mpl::true_:

namespace boost { namespace mpi {
  template<> struct is_mpi_datatype<point>
    : public mpl::true_ { };
} }

When I try exactly that to optimize performance [options (A) or (B) in my attempt below], I observe that Boost.MPI neither uses the builtin MPI datatype MPI_DOUBLE_COMPLEX nor maps std::plus to the MPI_SUM operation [assertions (2) and (3) in my attempt below]. Moreover, enabling either (A) or (B) and removing assertions (2) and (3) yields a segmentation fault at runtime.

In some source file of Boost.MPI I have found an undocumented(?) macro called BOOST_MPI_DATATYPE which does the right thing, but is marked with the comment /// INTERNAL ONLY.

Before implementing this ugly hack(?) I would like to ask: What is the intended way to tell Boost.MPI to use the builtin MPI_DOUBLE_COMPLEX datatype for std::complex<double>?

#include <complex>
#include <functional>
#include <iostream>

#include <boost/mpi.hpp>
#include <boost/mpi/operations.hpp>
#include <boost/serialization/complex.hpp>

// tested with GCC 6.2.0, OpenMPI 2.0.1, boost 1.62.0
// mpic++ -lboost_mpi -lboost_serialization boost-mpi-complex.cpp

using dcomplex = std::complex<double>;
using dcplus = std::plus<dcomplex>;

////////////////////////////////////////////////////////////////////////////////
// How to pass assertions (1) to (5) below?
////////////////////////////////////////////////////////////////////////////////

// (A): documented, but fails assertions (2) and (3)
// if (2) and (3) are removed with this OPTION: segmentation fault at runtime
//BOOST_IS_MPI_DATATYPE(dcomplex)

// (B): documented, but same problems as (A)
namespace boost::mpi {
//template<> struct is_mpi_datatype<dcomplex> : boost::mpl::true_ {};
}

// (C): works, but not documented(?) and has `INTERNAL ONLY` comment in source
namespace boost::mpi {
//BOOST_MPI_DATATYPE(dcomplex, MPI_DOUBLE_COMPLEX, complex);
}

// (D): works, equivalent to (C)
// BUT if `is_mpi_complex_datatype` is specialized without `get_mpi_datatype`
// then compilation is fine with all assertions, but running yields segfault
namespace boost::mpi {
//template<> inline MPI_Datatype get_mpi_datatype<dcomplex>(const dcomplex&) {
//  return MPI_DOUBLE_COMPLEX;
//}
//template<> struct is_mpi_complex_datatype<dcomplex> : boost::mpl::true_ {};
}

// optional; works as expected for assertion (4)
namespace boost::mpi {
template<> struct is_commutative<dcplus, dcomplex> : mpl::true_ {};
}

// If these assertions are removed and none of (A) to (D) is activated then
// everything works as expected, but I would like to optimize serialization
static_assert(boost::mpi::is_mpi_datatype<dcomplex>{});                   // (1)
static_assert(boost::mpi::is_mpi_builtin_datatype<dcomplex>{});           // (2)

static_assert(boost::mpi::is_mpi_op<dcplus, dcomplex>{});                 // (3)
static_assert(boost::mpi::is_commutative<dcplus, dcomplex>{});            // (4)

static_assert(boost::serialization::is_bitwise_serializable<dcomplex>{}); // (5)

int main() {
  boost::mpi::environment env{};
  boost::mpi::communicator world{};

  constexpr size_t N = 4;

  dcomplex data[N]{};
  if(0 == world.rank()) {
    for(size_t i=0; i<N; ++i) data[i] = dcomplex{double(i), 0.0};
  }
  if(1 == world.rank()) {
    for(size_t i=0; i<N; ++i) data[i] = dcomplex{0.0, double(N+i)};
  }

  all_reduce(world, boost::mpi::inplace(data), N, dcplus{});

  if(0 == world.rank()) {
    for(auto&& x : data) std::cout << x << std::endl;
  }

  return 0;
}
Julius
  • option (D) looks correct to me. What is the problem with that one? assertion (2) should always fail, since complex is not builtin, and you shouldn't mess with the boost internals. I expect (3) should also fail, since you are providing a user defined operator. This should be fine nonetheless. – Patrick Nov 28 '16 at 21:08
  • To me, (D) feels as `INTERNAL ONLY` as (C). Moreover, the [Boost.MPI](http://www.boost.org/doc/libs/1_62_0/doc/html/boost/mpi/is_mpi_builtin_datatype.html) documentation states that "In general, users should not need to specialize this trait.". The [MPI specification 3.1](http://mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf) lists `MPI_CXX_DOUBLE_COMPLEX` and `MPI_DOUBLE_COMPLEX` as "Predefined MPI datatypes". Finally, `std::plus` [is said to map to MPI_PLUS](http://www.boost.org/doc/libs/1_62_0/doc/html/mpi/tutorial.html#mpi.tutorial.c_mapping.t5). – Julius Dec 07 '16 at 12:21
  • Correction: std::plus [is said to map to MPI_SUM](http://www.boost.org/doc/libs/1_62_0/doc/html/mpi/tutorial.html#mpi.tutorial.c_mapping.t5). – Julius Dec 07 '16 at 12:29
  • Since boost doesn't properly supply these builtin datatypes, I think you should just define it as if you had a class/struct with a custom MPI_Datatype. I.e. specialize get_mpi_datatype to return the corresponding MPI_DOUBLE_COMPLEX. This will still use custom operators for the reduction, but this should be the same efficiency as the builtin reduction MPI_SUM (in my experience at least). – Patrick Dec 07 '16 at 22:31

0 Answers