I looked at the documentation for both Boost Serialization and the Cereal library, but I didn't find anything specific. I'm wondering whether it's possible to add an integrity check to the serialized data. I'm not talking about security: if the saved file is corrupted for any reason, the loaded data can be completely wrong. Do these libraries support anything for this? I thought about implementing something myself, but there's a problem in the load phase for both libraries:

template<class Archive>
void load(Archive& ar) {
   //checksum here??
   ar >> mydata;
}

To calculate the checksum I need to read all of the data. However, with both libraries I can't extract raw bytes from the archive; I can only fill the class attributes, hope everything loaded without errors, and then calculate the checksum. I'd like to calculate the checksum before loading the class attributes. Is that possible?

greywolf82

1 Answer

I don't know of a serialisation library that does this specifically.

When it is necessary, a common approach is to serialise the object first, and then place the resulting byte stream and its hash as the two fields of an intermediate object, which is serialised in turn. That final byte stream is what gets transmitted or stored.

On reception, the byte stream is deserialised to regenerate the intermediate object. The hash of its byte stream field is computed and compared with its hash field. If they match, the byte stream field can safely be deserialised to regenerate the original object.
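The wrap-and-verify steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the names `fnv1a`, `wrap` and `unwrap` are hypothetical, the envelope layout (a 4-byte hash followed by the payload) is invented for the example, and FNV-1a merely stands in for whatever checksum you would really use (CRC-32, SHA-256, etc.). It uses only the standard library, so the "intermediate object" is written out by hand rather than by a serialiser.

```cpp
#include <cstdint>
#include <string>

// FNV-1a 32-bit hash. A stand-in for a real checksum; chosen only
// because it is short enough to show inline.
inline std::uint32_t fnv1a(const std::string& bytes) {
    std::uint32_t h = 2166136261u;          // FNV offset basis
    for (unsigned char c : bytes) {
        h ^= c;
        h *= 16777619u;                     // FNV prime
    }
    return h;
}

// Build the hypothetical envelope: 4-byte big-endian hash, then payload.
// `payload` is the byte stream produced by serialising the real object.
inline std::string wrap(const std::string& payload) {
    const std::uint32_t h = fnv1a(payload);
    std::string envelope;
    envelope.reserve(payload.size() + 4);
    for (int shift = 24; shift >= 0; shift -= 8)
        envelope.push_back(static_cast<char>((h >> shift) & 0xFF));
    envelope += payload;
    return envelope;
}

// Verify the envelope. On success, copies the payload into `payload`
// and returns true; returns false if the data is truncated or the
// stored hash does not match the recomputed one.
inline bool unwrap(const std::string& envelope, std::string& payload) {
    if (envelope.size() < 4) return false;
    std::uint32_t stored = 0;
    for (int i = 0; i < 4; ++i)
        stored = (stored << 8) | static_cast<unsigned char>(envelope[i]);
    const std::string body = envelope.substr(4);
    if (fnv1a(body) != stored) return false;
    payload = body;
    return true;
}
```

With Boost Serialization or Cereal, one way to apply this (again, an assumption about how you would wire it up) is to write the output archive into a `std::ostringstream`, pass its `str()` to `wrap`, and save the result to the file. On load, read the whole file, call `unwrap`, and only if it succeeds construct the input archive over a `std::istringstream` holding the verified payload. The checksum is then checked before any class attribute is touched.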

This is a little inefficient, since there are two objects to serialise. However, the intermediate object is mostly just a byte stream, which is often trivial to serialise (especially for binary serialisers like Google Protocol Buffers).

It's also quite often unnecessary. Things like filesystems, TCP, etc. already have error checking, and in some cases error correction, built into them. If your transport or storage medium already checks data integrity, supplementing it might be overkill. You mention file storage: using a filesystem like ZFS would be an excellent way of ensuring data integrity (plus a load of other benefits), reducing the need for your own check. No matter what you do, ZFS checksums every stored block anyway, and it can repair a damaged block when a redundant copy is available.

bazza