
I get a std::length_error exception when using the cereal library to deserialize a std::vector of a class of my own. I think it's easiest if I give some code. This is my class:

#include "cereal/archives/portable_binary.hpp"
#include "cereal/archives/json.hpp"
#include "cereal/types/vector.hpp"

enum class myType {
    None, unset, Type1, Type2, Type3
};

class myClass
{
public:
    myClass();
    myClass(size_t siz);
    ~myClass();
    std::vector<size_t> idxs;
    myType dtype;
    bool isvalid;

    // This method lets cereal know which data members to serialize
    template<class Archive>
    void serialize(Archive & archive)
    {
        archive(CEREAL_NVP(dtype), CEREAL_NVP(isvalid), CEREAL_NVP(idxs));
    }
protected:
private:
};

The idxs member is not necessarily always of the same size. After some calculations I then obtain a

std::vector<myClass> allData;

which I want to serialize and later deserialize in another application. This is my code for serialization:

std::ofstream ofile(allDataFilename.c_str());
if (ofile.good())
{
    cereal::PortableBinaryOutputArchive theArchive(ofile);
    theArchive(allData);
    //ofile.close();   // Do not close explicitly: the archive's dtor must finish writing to the file first (RAII)!
}
else
{
    std::cout << "Serialization to portable binary archive failed. File not good." << "\n";
}

The generated data file has a non-zero size and is not all zeros, so at first glance it looks fine. This is what I do to deserialize in the other application:

std::string allDataFilename("C:\\path\\to\\file\\data.dat");
std::ifstream infile(allDataFilename.c_str());
std::vector<myClass> myDataFromDisk;
if (infile.good())
{
    cereal::PortableBinaryInputArchive inArchive(infile);
    inArchive >> myDataFromDisk;
}
else
{
    std::cout << "Data file unavailable." << "\n";
}

When I run this deserialization code, I get a std::length_error exception. I found a seemingly related discussion of this error, but it does not appear relevant to my case. (Or is it?)

I tried to de-/serialize with separate load/save functions, because I was not sure whether this part of the cereal documentation applies here:

When possible, it is preferred to use a single internal serialize method, though split methods can be used when it is necessary (e.g. to dynamically allocate memory upon loading a class).

I also tried archiving each element of the idxs member separately in a range-based for loop (as cereal does internally anyway), but neither approach helped.
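For reference, the split load/save variant mentioned above can be sketched like this. The MockArchive below is a hypothetical stand-in for a cereal archive, used only so the fragment compiles on its own; with cereal, any callable archive is passed to these same templates.

```cpp
#include <cstddef>
#include <vector>

enum class myType {
    None, unset, Type1, Type2, Type3
};

// Hypothetical stand-in for a cereal archive: it just counts how many
// fields were handed to it, so the split pattern can be exercised standalone.
struct MockArchive
{
    int fields = 0;
    template<class... Ts>
    void operator()(Ts&&...) { fields += static_cast<int>(sizeof...(Ts)); }
};

class myClass
{
public:
    std::vector<size_t> idxs;
    myType dtype = myType::None;
    bool isvalid = false;

    // Split pattern: cereal calls save() when writing an archive
    // and load() when reading one back.
    template<class Archive>
    void save(Archive & archive) const
    {
        archive(dtype, isvalid, idxs);
    }

    template<class Archive>
    void load(Archive & archive)
    {
        archive(dtype, isvalid, idxs);
    }
};
```

Both halves must list the members in the same order, since binary archives carry no field names.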

Both applications are compiled with Visual Studio 2015 Update 3. I use the current cereal v1.2.2 but also tried with cereal v1.1.2 which gave me a bit-identical serialization result.

As an aside: it works with a cereal JSON archive, but only after I changed the serialization call to

archive(CEREAL_NVP(dtype), CEREAL_NVP(isvalid), CEREAL_NVP(idxs));

It did not work with JSON when the vector member came first:

archive(CEREAL_NVP(idxs), CEREAL_NVP(dtype), CEREAL_NVP(isvalid));

But this might be completely unrelated.

Now my questions:

1) Is this the way serialization is supposed to work with cereal?

2) Do I need to add more serialization functions? E.g. to the enum class?

Best regards, AverageCoder

1 Answer

There is nothing wrong with your class or your serialization code. You do not need to provide serialization for enums; it is handled automatically via cereal/types/common.hpp. The order in which your fields are serialized does not matter.
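For intuition, an enum is serialized as its underlying integer value. The helpers below are only a sketch of that idea, not cereal's actual code, which lives in cereal/types/common.hpp and works generically for any enum:

```cpp
#include <type_traits>

enum class myType {
    None, unset, Type1, Type2, Type3
};

// Convert an enum to its underlying integer type, the form in which an
// archive would store it (illustrative only).
template<class E>
constexpr typename std::underlying_type<E>::type to_raw(E e)
{
    return static_cast<typename std::underlying_type<E>::type>(e);
}

// Convert the stored integer back to the enum on load (illustrative only).
template<class E>
constexpr E from_raw(typename std::underlying_type<E>::type v)
{
    return static_cast<E>(v);
}
```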

Your error comes from not using the archives properly when loading and saving. cereal handles all interaction with the underlying stream, so you should not use the streaming operators (i.e. << or >>) directly on a cereal archive. Take another look at the examples on the cereal website and you'll notice that every interaction with a cereal archive goes through the () operator.

You should also ensure that you are using the appropriate flag (std::ios::binary) when operating on streams that deal with binary data - this can prevent some problems that are hard to debug.
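The difference the flag makes can be demonstrated with a tiny round-trip helper; this is a sketch, and the byte values and file name are purely illustrative:

```cpp
#include <fstream>
#include <string>

// Round-trip raw bytes through a file. Opening with std::ios::binary
// matters on Windows: in text mode, 0x0A is rewritten as 0x0D 0x0A on
// write and 0x1A is treated as end-of-file on read, either of which
// silently corrupts a binary archive.
std::string binary_roundtrip(const std::string & payload, const std::string & path)
{
    {
        std::ofstream out(path, std::ios::binary);
        out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    }
    std::string readback(payload.size(), '\0');
    std::ifstream in(path, std::ios::binary);
    in.read(&readback[0], static_cast<std::streamsize>(readback.size()));
    return readback;
}
```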

Here is a working example using your class where I'm saving to an in-memory stream rather than a file, but the principle is the same:

#include <cereal/archives/portable_binary.hpp>
#include <cereal/archives/json.hpp>
#include <cereal/types/vector.hpp>
#include <numeric>   // std::iota lives here, not in <algorithm>
#include <sstream>

enum class myType {
    None, unset, Type1, Type2, Type3
};

class myClass
{
public:
  myClass() = default;
  // Initializer list matches declaration order (members are always
  // initialized in declaration order regardless of the list's order)
  myClass( myType mt, size_t i ) : idxs( i ), dtype( mt ), isvalid( true )
  {
    std::iota( idxs.begin(), idxs.end(), i );
  }

  std::vector<size_t> idxs;
  myType dtype;
  bool isvalid;

  // This method lets cereal know which data members to serialize
  template<class Archive>
  void serialize(Archive & archive)
  {
    archive(CEREAL_NVP(dtype), CEREAL_NVP(isvalid), CEREAL_NVP(idxs));
  }
};

int main(int argc, char* argv[])
{
  std::vector<myClass> allData = {{myType::None, 3}, {myType::unset, 2}, {myType::Type3, 5}};

  // When dealing with binary archives, always use the std::ios::binary flag
  // I'm using a stringstream here just to avoid writing to file
  std::stringstream ssb( std::ios::in | std::ios::out | std::ios::binary );
  {
    cereal::PortableBinaryOutputArchive arb(ssb);
    // The JSON archive is only used to print out the data for display
    cereal::JSONOutputArchive ar(std::cout);

    arb( allData );
    ar( allData );
  }

  {
    cereal::PortableBinaryInputArchive arb(ssb);
    cereal::JSONOutputArchive ar(std::cout);

    std::vector<myClass> data;
    arb( data );

    // Write the data out again and visually inspect
    ar( data );
  }

  return 0;
}

and its output:

{
    "value0": [
        {
            "dtype": 0,
            "isvalid": true,
            "idxs": [
                3,
                4,
                5
            ]
        },
        {
            "dtype": 1,
            "isvalid": true,
            "idxs": [
                2,
                3
            ]
        },
        {
            "dtype": 4,
            "isvalid": true,
            "idxs": [
                5,
                6,
                7,
                8,
                9
            ]
        }
    ]
}{
    "value0": [
        {
            "dtype": 0,
            "isvalid": true,
            "idxs": [
                3,
                4,
                5
            ]
        },
        {
            "dtype": 1,
            "isvalid": true,
            "idxs": [
                2,
                3
            ]
        },
        {
            "dtype": 4,
            "isvalid": true,
            "idxs": [
                5,
                6,
                7,
                8,
                9
            ]
        }
    ]
}
Azoth
  • Specifying the std::ios::binary flag did not help. Using the operator() instead of operator>> did not make a change either. I think I have to dig deeper and strip my example further down or try loading within the same app. – AverageCoder Feb 17 '17 at 15:44
  • Serializing and deserializing within the same application makes no difference. It also fails. The next thing I will try is the following: My allData std::vector is reserved to a bigger size than what will eventually end in it (size < reserved), perhaps that is a problem on serializing. I will update the question here according to my findings what worked and also what didn't work. – AverageCoder Feb 21 '17 at 07:19
  • cereal resizes the vector appropriately during a load - your reserved capacity shouldn't make any difference. I encourage you to look again at the example I posted and those on the cereal website, as from your descriptions I think you are making a simple interface error somewhere that is causing your problems. – Azoth Feb 21 '17 at 20:59
  • To give a short update: I haven't yet found what the issue was but for now the situation is: After changing some calculations in myClass of the idxs member, the new output can be read by cereal as portableBinaryArchive. The number of items in the allData-vector is now only half of what it was originally (now ~8K items of myClass, formerly ~16K). However, opening the failing old archive via JSON (what works) and the new archive via portableBinaryArchive, I can see no difference in structure in the debugger. I am a bit stumped what to learn from all that, as I don't think it is a size issue. – AverageCoder Mar 08 '17 at 13:42
  • As the problem probably is somewhere on my side, I will accept this answer as it answered my question such that I am not doing anything fundamentally wrong here :-) Thanks again, @Azoth. – AverageCoder Mar 08 '17 at 13:44