Defining the structure of a binary file in C++ 11

Question

Since I have to work with binary files a lot, I would like a more abstract way to do it. I have to perform the same steps over and over:

write a header
write different kinds of chunks (with different sets of values) in a given order
write an optional closing header

Now I would like to break this problem down into small building blocks. Imagine being able to write something like what a DTD is for XML: a definition of what can appear after a given chunk or inside a given semantic unit. That way I could think about my files in terms of building blocks instead of hex values, and the code would be much more "idiomatic" and less cryptic.

In the end, is there something in the language that can help me with binary files from this perspective?

In the language? No. Boost Serialization might help you out, though its serializations aren't quite portable. — Cory Nelson, Jul 15 '13 at 03:31

score 3 · Accepted Answer · answered Jul 15 '13 at 04:18

I'm not sure about C++11 specific features, but for C++ in general, streams make file I/O much easier to work with. You can overload the stream insertion (<<) and stream extraction (>>) operators to accomplish your goals. If you're not very familiar with operator overloading, chapter 9 of this site explains it well, with numerous examples. Here's the particular page for overloading the << and >> operators in the context of streams.

Allow me to illustrate what I mean. Suppose we define a few classes:

BinaryFileStream - which represents the file you are trying to write to and (possibly) read from.
BinaryFileStreamHeader - which represents the file header.
BinaryFileStreamChunk - which represents one chunk.
BinaryFileStreamClosingHeader - which represents the closing header.

Then, you can overload the stream insertion and extraction operators in your BinaryFileStream to write and read the file (or any other istream or ostream).

...
#include <iostream> // I/O stream definitions, you can specify your overloads for
                    // ifstream and ofstream, but doing so for istream and ostream is
                    // more general

#include <vector>   // For holding the chunks

class BinaryFileStream
{
public:
...
    // Write binary stream
    friend std::ostream& operator<<( std::ostream& os, const BinaryFileStream& bfs )
    {
         // Write header
         os << bfs.mHeader;

         // Write chunks (bfs is const, so a const_iterator is required)
         std::vector<BinaryFileStreamChunk>::const_iterator it;
         for( it = bfs.mChunks.begin(); it != bfs.mChunks.end(); ++it )
         {
             os << (*it);
         }

         // Write Closing Header
         os << bfs.mClosingHeader;

         return os;
    }
...
private:
    BinaryFileStreamHeader             mHeader;
    std::vector<BinaryFileStreamChunk> mChunks;
    BinaryFileStreamClosingHeader      mClosingHeader;
};

All you must do then is provide operator overloads for your BinaryFileStreamHeader, BinaryFileStreamChunk and BinaryFileStreamClosingHeader classes that convert their data into the appropriate binary representation.
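
As a minimal sketch of what such overloads might look like: the field layouts below (a 4-byte magic tag plus a version for the header, an id plus a length-prefixed payload for the chunk) and the helper names writeU32 and demoBytes are assumptions made up for illustration, not part of any real file format. Integers are written byte by byte in little-endian order so the output does not depend on the host's byte order or on struct padding.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical layout: a 4-byte magic tag followed by a 32-bit version.
struct BinaryFileStreamHeader
{
    char          magic[4];
    std::uint32_t version;
};

// Hypothetical layout: a 32-bit id followed by a length-prefixed payload.
struct BinaryFileStreamChunk
{
    std::uint32_t id;
    std::string   payload;
};

// Write a 32-bit value as four little-endian bytes.
inline void writeU32( std::ostream& os, std::uint32_t v )
{
    const char bytes[4] = {
        static_cast<char>( v & 0xFF ),
        static_cast<char>( ( v >> 8 ) & 0xFF ),
        static_cast<char>( ( v >> 16 ) & 0xFF ),
        static_cast<char>( ( v >> 24 ) & 0xFF )
    };
    os.write( bytes, 4 );
}

inline std::ostream& operator<<( std::ostream& os, const BinaryFileStreamHeader& h )
{
    os.write( h.magic, 4 );
    writeU32( os, h.version );
    return os;
}

inline std::ostream& operator<<( std::ostream& os, const BinaryFileStreamChunk& c )
{
    // The length prefix is 32 bits wide, so the payload must fit in it.
    assert( c.payload.size() <= 0xFFFFFFFFu );
    writeU32( os, c.id );
    writeU32( os, static_cast<std::uint32_t>( c.payload.size() ) );
    os.write( c.payload.data(), static_cast<std::streamsize>( c.payload.size() ) );
    return os;
}

// Usage sketch: serialize one header and one chunk into memory.
inline std::string demoBytes()
{
    std::ostringstream out;
    const BinaryFileStreamHeader header = { { 'D', 'E', 'M', 'O' }, 1 };
    const BinaryFileStreamChunk  chunk  = { 7, "abc" };
    out << header << chunk;
    return out.str();
}
```

When writing to a real file rather than a string stream, the std::ofstream should be opened with std::ios::binary so that no newline translation is applied to the bytes.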

You can overload the stream extraction operator (>>) in an analogous way, though some extra work may be required for parsing.
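
A rough sketch of the reading side follows, under the same kind of made-up layout (a "DEMO" magic tag and 32-bit little-endian fields, all of which are assumptions for illustration, as are the helpers readU32 and demoParse). The "extra work" shows up as validation: checking for short reads, checking the magic tag, and only committing values to the target object once everything has been read.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical layouts, chosen only for illustration.
struct BinaryFileStreamHeader { char magic[4]; std::uint32_t version; };
struct BinaryFileStreamChunk  { std::uint32_t id; std::string payload; };

// Read four little-endian bytes into v. Returns false on a short read,
// in which case the stream's failbit has already been set by read().
inline bool readU32( std::istream& is, std::uint32_t& v )
{
    unsigned char b[4];
    if( !is.read( reinterpret_cast<char*>( b ), 4 ) )
        return false;
    v = static_cast<std::uint32_t>( b[0] )
      | ( static_cast<std::uint32_t>( b[1] ) << 8 )
      | ( static_cast<std::uint32_t>( b[2] ) << 16 )
      | ( static_cast<std::uint32_t>( b[3] ) << 24 );
    return true;
}

inline std::istream& operator>>( std::istream& is, BinaryFileStreamHeader& h )
{
    char          magic[4];
    std::uint32_t version = 0;
    if( !is.read( magic, 4 ) || !readU32( is, version ) )
        return is;
    if( std::memcmp( magic, "DEMO", 4 ) != 0 )
    {
        // Not our (hypothetical) format: report it the way streams do.
        is.setstate( std::ios_base::failbit );
        return is;
    }
    std::memcpy( h.magic, magic, 4 );
    h.version = version;
    return is;
}

inline std::istream& operator>>( std::istream& is, BinaryFileStreamChunk& c )
{
    std::uint32_t id = 0, size = 0;
    if( !readU32( is, id ) || !readU32( is, size ) )
        return is;
    // A real parser would bound 'size' before allocating from untrusted input.
    std::string payload( size, '\0' );
    if( size > 0 && !is.read( &payload[0], static_cast<std::streamsize>( size ) ) )
        return is;
    c.id = id;
    c.payload.swap( payload );
    assert( c.payload.size() == size );
    return is;
}

// Usage sketch: parse a header followed by one chunk from a byte string.
inline bool demoParse( const std::string& bytes,
                       BinaryFileStreamHeader& h, BinaryFileStreamChunk& c )
{
    std::istringstream in( bytes );
    in >> h >> c;
    return !in.fail();
}
```

Because a failed extraction leaves the stream in a failed state, later extractions in the same chain become no-ops, so the caller only needs to check the stream once at the end.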

Hope this helps.

If you're really clever, your stream-input and stream-output functions can be merged into just one function that uses a boolean flag to switch between reading and writing, since the majority of the time the stream-input and stream-output functions for your classes are identical anyway. — Jamin Grey, Jul 15 '13 at 04:32
Well, I suppose that if the syntax is not there, it's not there. I was trying to avoid this procedural and programmatic style in favor of something more abstract. — user2485710, Jul 15 '13 at 04:46
@JaminGrey you mean templating the value of an int/bool or putting an if/else inside 1 function ? — user2485710, Jul 15 '13 at 04:46
You could do it that way, but then every "ReadFileFormatX" in assembly would look like a hodgepodge of if()-elses() for every read, and slow things down. In my code I create one function for read, and one for write at the byte level (*stream.ReadBytes(&data, size_t byteCount), stream.WriteBytes(&data, size_t byteCount)*). Then I just have "Serialize(stream, &i)", which internally calls stream.SerializeBytes(&i, sizeof(int)), and if 'stream' was set to Read, SerializeBytes is actually a function pointer to ReadBytes(), and if set to Write, it's actually a pointer to WriteBytes(). — Jamin Grey, Jul 15 '13 at 15:54
Then I just specialize Serialize(stream, &type) for more advanced types, like std::string or custom classes. The custom class specializations just call *Serialize(stream, &myInt)*, and *Serialize(stream, &myFloat)* and etc... — Jamin Grey, Jul 15 '13 at 15:56
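
The approach described in the last two comments might be sketched roughly as follows. This is a reconstruction from the description only, not the commenter's actual code: the class name SerialStream, the Chunk struct, and the roundTrip helper are invented for illustration, plain overloads stand in for the "specializations", and values are copied as raw host-order bytes, so the resulting data is not portable across architectures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stream wrapper: SerializeBytes dispatches through a member
// function pointer that is bound once, at construction, to read or write.
class SerialStream
{
public:
    enum Mode { Read, Write };

    SerialStream( std::iostream& io, Mode mode )
        : mIo( io ),
          mSerializeBytes( mode == Read ? &SerialStream::ReadBytes
                                        : &SerialStream::WriteBytes )
    {}

    void SerializeBytes( void* data, std::size_t byteCount )
    {
        ( this->*mSerializeBytes )( data, byteCount );
    }

    bool IsReading() const { return mSerializeBytes == &SerialStream::ReadBytes; }

private:
    void ReadBytes( void* data, std::size_t n )
    {
        mIo.read( static_cast<char*>( data ), static_cast<std::streamsize>( n ) );
    }
    void WriteBytes( void* data, std::size_t n )
    {
        mIo.write( static_cast<const char*>( data ), static_cast<std::streamsize>( n ) );
    }

    std::iostream& mIo;
    void ( SerialStream::*mSerializeBytes )( void*, std::size_t );
};

// One Serialize per type; each works for both reading and writing.
inline void Serialize( SerialStream& s, std::int32_t* v ) { s.SerializeBytes( v, sizeof *v ); }
inline void Serialize( SerialStream& s, float* v )        { s.SerializeBytes( v, sizeof *v ); }

inline void Serialize( SerialStream& s, std::string* str )
{
    assert( str->size() <= 0xFFFFFFFFu );        // length prefix is 32 bits
    std::uint32_t size = static_cast<std::uint32_t>( str->size() );
    s.SerializeBytes( &size, sizeof size );      // written, or overwritten on read
    if( s.IsReading() )
        str->resize( size );
    if( size > 0 )
        s.SerializeBytes( &( *str )[0], size );
}

// A custom type just forwards to the Serialize overloads of its members.
struct Chunk { std::int32_t id; float weight; std::string name; };

inline void Serialize( SerialStream& s, Chunk* c )
{
    Serialize( s, &c->id );
    Serialize( s, &c->weight );
    Serialize( s, &c->name );
}

// Usage sketch: write a chunk to memory, then read it back.
inline Chunk roundTrip( Chunk original )
{
    std::stringstream buffer( std::ios::in | std::ios::out | std::ios::binary );
    SerialStream writer( buffer, SerialStream::Write );
    Serialize( writer, &original );

    Chunk copy = Chunk();
    SerialStream reader( buffer, SerialStream::Read );
    Serialize( reader, &copy );
    return copy;
}
```

The only place that has to know the direction is the variable-length case (std::string), which must resize its buffer before reading; fixed-size fields share exactly the same code path in both directions.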
