How to iterate over archive in boost::serialization

Question

I wrote several records into a boost::archive::text_oarchive, and now I need to read them back. Because the archive contains multiple records, I need some kind of iterator.

Something like:

//input archive
boost::archive::text_iarchive iarch(ifs);

//read until the end of file
while (!iarch.eof()){

//read current value
iarch >> temp;
...//do something with temp

}

Is there any standard way to iterate over the elements of the archive? I found only iarchive.iterator_type, but is it what I need, and how do I use it?

How about either serializing a vector of "records" ([the serialization handles STL's vectors, etc.](http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/index.html)), or first serializing the number of elements in the "record" container and then serializing the container itself? — megabyte1024, Mar 25 '13 at 14:23
I was going to say something similar to megabyte, see http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/index.html (search for STL Collections) — Caribou, Mar 25 '13 at 14:26
Yes, I was thinking about it, but the thing is that I have `std::vector` of pointers to some complex derived datatype with its own `serialize` implemented as private, and when I want to push the whole vector into archive, the compiler complains that method `serialize` is missing in the object `std::vector`, so I had to write to archive element by element of type `Derived*` from the vector — Vyacheslav, Mar 25 '13 at 14:41
I'm fairly sure the iterator_type you are looking at is used during the operation of the class and not for external applications. The iterator_type comes from the derivation from `public detail::shared_ptr_helper` — Caribou, Mar 25 '13 at 14:44
edit to comment: not `std::vector`, but `std::list`. But shouldn't really matter — Vyacheslav, Mar 25 '13 at 14:51
ok seems like my answer isn't too helpful to your problem :) I'll leave it for now — Caribou, Mar 25 '13 at 14:56
Thanks, got it. Forgot to include boost/serialization/list.hpp >_< — Vyacheslav, Mar 25 '13 at 15:33

Caribou · Accepted Answer · 2013-03-25T14:59:43.053

The iterator type you are looking at actually comes from

class shared_ptr_helper {
    ...
    typedef std::set<
        boost::shared_ptr<const void>,
        collection_type_compare
    > collection_type;
    typedef collection_type::const_iterator iterator_type;

which, I think, is used internally while the archive is being loaded rather than being an iterator for external use.

If you look at the link http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/index.html under tutorial -> STL Collection you will see the following example:

#include <boost/serialization/list.hpp>

class bus_route
{
    friend class boost::serialization::access;
    std::list<bus_stop *> stops;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & stops;
    }
public:
    bus_route(){}
};

If that isn't quite what you need then you would probably need to look at overriding load and save as per http://www.boost.org/doc/libs/1_53_0/libs/serialization/doc/tutorial.html#splitting and adding handling as required.

That was my plan initially, but the compiler says `‘class std::list’ has no member named ‘serialize’`. Even though `serialize` is implemented in `Derived` — Vyacheslav, Mar 25 '13 at 15:10

alfC · Answer 2 · 2023-08-29T23:32:26.437

Although it is not formally defined, the concept of an archive doesn't include the ability to detect the end. Archives should be self-contained in a sense: you should be able to read (deserialize) exactly the right amount of data deterministically, because that information is itself part of the archive.

boost::archive::text_iarchive iarch(ifs);

int size;
iarch >> size;

for(int i = 0; i != size; ++i) {
  T temp;
  iarch >> temp;
  ...//do something with temp
}

As you can see, there is no unbounded while loop. There is a certain logic that distinguishes an archive from some arbitrary stream of data, even if the syntax looks similar.

Having said that, for some archives, and if you have access to the underlying stream, you can check for the end of the stream.

boost::archive::text_iarchive iarch(ifs);
while(ifs) {
  T temp;
  iarch >> temp;
  ...//do something with temp
}

However, big HOWEVER, this breaks the abstraction of the archive, and it will not work for archives that have tags (like XML archives) because the last entry of the archive doesn't necessarily coincide with the end of the underlying stream.

Once you relax the abstraction of the archive, even for a concrete archive type, you are trying to read tea leaves. If you are trying to read streams using archive facilities, you might be in better shape by inverting the logic and interpreting the stream as a bunch of archives instead.

... stream logic to get to the point where an archive starts
while(ifs) {
  {
    boost::archive::text_iarchive iarch(ifs, boost::archive::no_header);
    T temp;
    iarch >> temp;  // using archive deserialization, not low level stream reading
  }
  ... stream logic to process until the next "archive" starts, possibly break
}

In this way, you are not breaking the archive abstraction, at least not directly; the stream is not manipulated directly when the archive abstraction is alive.

If it sounds complicated, that is because it is, and that is because you are trying to hack your way in.
