9

I serialize multiple objects into a binary archive with Boost. When reading those objects back from a binary_iarchive, is there a way to know how many objects are in the archive, or simply a way to detect the end of the archive?

The only way I found is to use a try-catch to detect the stream exception. Thanks in advance.

rcollyer
Shnippoo
  • If using try-catch is the only way it'll work for you, then you should answer your own question as such, and accept your own answer. Doing this is permitted in Stack Overflow (see FAQ). – Emile Cormier Jul 18 '11 at 17:53

5 Answers

6

I can think of a number of approaches:

  1. Serialize STL containers to/from your archive (see documentation). The archive will automatically keep track of how many objects there are in the containers.

  2. Serialize a count variable before serializing your objects. When reading back your objects, you'll know beforehand how many objects you expect to read back.

  3. You could have the last object have a special value that acts as a kind of sentinel that indicates the end of the list of objects. Perhaps you could add an isLast member function to the object.

  4. This is not very pretty, but you could have a separate "index file" alongside your archive that stores the number of objects in the archive.

  5. Use the tellg position of the underlying input stream object to detect if you're at the end of file:

Example (just a sketch, not tested):

// Remember the current read position, find the end of the stream, then restore.
std::streampos archiveOffset = stream.tellg();
std::streampos streamEnd = stream.seekg(0, std::ios_base::end).tellg();
stream.seekg(archiveOffset);

// tellg (not tellp) reports the read position of an input stream.
while (stream.tellg() < streamEnd)
{
    // Deserialize objects
}

This might not work with XML archives.
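Approach 2 (writing a count first) can be sketched with plain standard-library streams. This is only an illustration under assumptions: `Record`, `writeWithCount` and `readWithCount` are hypothetical names, and raw `write`/`read` calls stand in for the Boost archive. With a Boost archive the same idea would be to stream the count with `ar << count` before the objects and read it back with `ar >> count`.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical record type standing in for a serialized object.
struct Record {
    std::int32_t a;
    std::int32_t b;
};

// Write the object count first, then the objects themselves.
void writeWithCount(std::ostream& out, const std::vector<Record>& records) {
    assert(out.good());  // precondition: the stream is writable
    const std::uint64_t count = records.size();
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    for (const Record& r : records) {
        out.write(reinterpret_cast<const char*>(&r), sizeof r);
    }
}

// Read the count, then exactly that many objects; no end-of-stream probing.
std::vector<Record> readWithCount(std::istream& in) {
    std::uint64_t count = 0;  // stays 0 if the count cannot be read
    in.read(reinterpret_cast<char*>(&count), sizeof count);
    std::vector<Record> records;
    for (std::uint64_t i = 0; i < count; ++i) {
        Record r{};
        if (!in.read(reinterpret_cast<char*>(&r), sizeof r)) {
            break;  // truncated input: stop early
        }
        records.push_back(r);
    }
    return records;
}
```

The count has to be known before the first object is written, so this does not fit a writer that appends objects incrementally.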

Emile Cormier
  • Thanks for your answer, @Emile. Containers would be great indeed; the only problem is that my algorithm is anytime, i.e. it's iterative, and at the end of each iteration some data is serialized. If the algorithm stops unexpectedly at some point, I can still resume the run from the archive :) – Shnippoo Jul 16 '11 at 08:12
  • Wow... first time I've shot myself in the foot with an SO solution. The above code (#5) is broken (at least for my Boost version)! If you create an archive, the archive constructor might seek to some nonzero offset in the stream. Hence the later `stream.seekg(0)` will yield an error when reading from the archive, because the stream pointer is not where the archive expects it to be! – Marti Nito Sep 15 '16 at 23:29
  • @MartiNito: StackOverflow is not a free code delivery service. There are no guarantees to the code samples provided. You use them at your own risk. I had obviously not tested that example; I had merely provided it to better explain a possible solution to the OP's problem. Having said that, I thank you for finding the error and suggesting an improvement. Now others will benefit from your contribution. :-) – Emile Cormier Sep 16 '16 at 03:49
  • Yeah, I didn't mean to offend! I just took your code without checking. You even mentioned that it might be broken (for XML). That should have put me on alert. So it was really my fault :) – Marti Nito Sep 17 '16 at 13:45
0

Do you have all your objects when you begin serializing? If not, you are "abusing" Boost serialization; it is not meant to be used that way. However, I am using it that way, with a try-catch to find the end of the file, and it works for me. Just hide it away somewhere in the implementation. Beware, though: if you use it this way, you need to either not serialize pointers or disable pointer tracking.

If you do have all the objects already, see Emile's answer. They are all valid approaches.
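Hiding the try-catch inside one function might look like the sketch below. It uses plain standard-library streams rather than Boost, and `Sample`, `readOne` and `readUntilEnd` are hypothetical names. With a Boost archive, `readOne` would be an `ar >> obj` call, and the exception caught would, to my understanding, be `boost::archive::archive_exception`.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical object type standing in for a serialized class.
struct Sample {
    std::int32_t a;
    std::int32_t b;
};

// Hypothetical low-level read: throws when a full object cannot be read,
// mimicking an archive that reports end of input with an exception.
Sample readOne(std::istream& in) {
    Sample s{};
    if (!in.read(reinterpret_cast<char*>(&s), sizeof s)) {
        throw std::runtime_error("end of input");
    }
    return s;
}

// Callers see a plain vector; the exception handling stays in here.
std::vector<Sample> readUntilEnd(std::istream& in) {
    std::vector<Sample> result;
    try {
        for (;;) {
            result.push_back(readOne(in));
        }
    } catch (const std::runtime_error&) {
        // Treated as the normal end of the data, not as an error.
    }
    return result;
}
```

Note that this treats a truncated final object the same way as a clean end of file.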

Cookie
0

Sample code that I used to debug a similar issue (based on Emile's answer):

#include <fstream>
#include <iostream>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>

struct A{
    int a,b;
    template <typename T>
    void serialize(T &ar, int ){
        ar & a;
        ar & b;
    }
};


int main(){
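    {
        // Open in binary mode so no newline translation corrupts the archive.
        std::ofstream ofs( "ff.ar", std::ios::binary );
        boost::archive::binary_oarchive ar( ofs );
        
        for(int i=0;i<3;++i){
            A a {2,3};
            ar << a;
        }
        // ar is destroyed before ofs at the end of this scope, which closes the file.
    }

    {
        std::ifstream ifs( "ff.ar", std::ios::binary );
        // Find the end of the file before the archive starts reading.
        ifs.seekg (0, ifs.end);
        std::streampos length = ifs.tellg();
        ifs.seekg (0, ifs.beg);
    
        boost::archive::binary_iarchive ar( ifs );

        // Keep reading objects until the read position reaches the end.
        while(ifs.tellg() < length){
            A a;
            ar >> a;
            std::cout << "a.a-> "<< a.a << " and a.b->"<< a.b << "\n";
        }
    }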
    return 0;
}
Mohit
-1

Just try to read one byte from the file.

If you have not reached the end, step back one byte and continue reading.
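A minimal sketch of that idea with the standard library: `std::istream::peek()` looks at the next byte without consuming it, so no explicit step back is needed. `atEnd` and `countBytes` are hypothetical helper names. Using this between object reads on the stream underneath a Boost archive assumes the archive does not read ahead, which I have not tested.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// True when the next read would hit end-of-stream (or the stream has failed).
// peek() inspects the next byte without extracting it.
bool atEnd(std::istream& in) {
    return in.peek() == std::char_traits<char>::eof();
}

// Consume the stream one byte at a time, probing before each read.
std::size_t countBytes(std::istream& in) {
    std::size_t n = 0;
    while (!atEnd(in)) {
        in.get();
        ++n;
    }
    return n;
}
```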

Steven Shih
-1
std::istream* stream_;
boost::iostreams::filtering_streambuf<boost::iostreams::input>* filtering_streambuf_;
...
stream_ = new std::istream(memoryBuffer_);
if (stream_) {
  filtering_streambuf_ = new boost::iostreams::filtering_streambuf<boost::iostreams::input>();
  if (filtering_streambuf_) {
    filtering_streambuf_->push(boost::iostreams::gzip_decompressor());
    filtering_streambuf_->push(*stream_);

    archive_ = new eos::portable_iarchive(*filtering_streambuf_);
  }
}

I use gzip decompression when reading data from the archives, and filtering_streambuf has the following method:

std::streamsize std::streambuf::in_avail()
Get number of characters available to read

So I check for the end of the archive like this:

bool    IArchiveContainer::eof() const {
    if (filtering_streambuf_) {
        return filtering_streambuf_->in_avail() == 0;
    }
    return false;
}

This does not tell you how many objects are left in the archive, but it helps to detect the end of them. (I use the eof test only in unit tests for serializing and deserializing my classes and structures, to make sure that I read back everything I wrote.)

brass monkey
morfik
  • This won't work reliably. `in_avail` returns how much data is buffered. There could still be more data on disk or wherever it is reading from. – Brice M. Dempsey Jul 24 '23 at 03:00