NOTE: I know that this has been asked many times before, but none of the questions have had a link to a concrete, portable, maintained library for this.

I need a C or C++ library that implements Python/Ruby/Perl like pack/unpack functions. Does such a library exist?

EDIT: Because the data I am sending is simple, I have decided to just use memcpy, pointers, and the hton* functions. Do I need to manipulate a char in any way to send it over the network in a platform agnostic manner? (the char is only used as a byte, not as a character).

Linuxios
  • Could you explain what `pack`/`unpack` do in those languages? – user541686 Mar 02 '12 at 00:03
  • Pack and unpack are used to turn variables into a binary string and vice-versa, respectively. They are very useful for creating packets and binary files and other binary interactions. The languages that I mentioned have this function in their standard library. C/C++ don't. – Linuxios Mar 02 '12 at 00:12
  • Oh hmm... FYI, this might not even be possible. It would require a great deal of metaprogramming (or IDE support) to get all the field information and such... the only native language I know that could probably do this is [D](http://dlang.org/). – user541686 Mar 02 '12 at 00:22
  • I mean something along the lines of `sprintf`, just the specifiers would be for binary data. Just the same sort of giant `switch` statement. – Linuxios Mar 02 '12 at 00:33
  • I wrote a small header only library for this: [Php pack/unpack C++](https://stackoverflow.com/a/62194081/2279422) in case anyone is looking for same functionality with a similar API – Waqar Jul 12 '20 at 19:50

3 Answers

In C/C++ you would usually just write a struct with the members in the correct order (correct packing may require compiler-specific pragmas) and dump it to, or read it from, a file with a raw fwrite/fread (or write/read when dealing with C++ streams). In fact, pack and unpack were born to read data generated this way.

If you need the result in a buffer instead of a file, it's even easier: just copy the structure to your buffer with a memcpy.

If the representation must be portable, your main concerns are byte ordering and field packing; the first can be handled with the various hton* functions, the second with compiler-specific directives.
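The byte-ordering half could look roughly like the sketch below. It assumes a POSIX system, where `htonl`/`ntohl` are declared in `<arpa/inet.h>` (on Windows they come from `<winsock2.h>` instead). `put_u32` and `get_u32` are hypothetical helper names made up for illustration, not part of any library.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (assumption: POSIX, not Windows)
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: store a 32-bit value into buf in network (big-endian) order.
inline void put_u32(unsigned char* buf, std::uint32_t value) {
    assert(buf != nullptr);
    const std::uint32_t wire = htonl(value);  // host order -> network order
    std::memcpy(buf, &wire, sizeof wire);     // copy the raw bytes into the buffer
}

// Hypothetical helper: read a 32-bit value that was stored in network order.
inline std::uint32_t get_u32(const unsigned char* buf) {
    assert(buf != nullptr);
    std::uint32_t wire;
    std::memcpy(&wire, buf, sizeof wire);     // copy the raw bytes out of the buffer
    return ntohl(wire);                       // network order -> host order
}
```

Because the conversion goes through `htonl`/`ntohl`, the bytes in the buffer are the same on big-endian and little-endian hosts.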

In particular, many compilers support the #pragma pack directive (both VC++ and GCC document it), which lets you control the (unwanted) padding that the compiler may insert into a struct to align its fields on convenient boundaries.
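As a small illustration of the directive, assuming a compiler that accepts the `push`/`pop` form of `#pragma pack` (GCC, Clang and MSVC do), the following sketch declares the same three fields twice. The struct and field names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical wire-format header; the field names are made up for illustration.
#pragma pack(push, 1)  // request no padding between members
struct PackedHeader {
    std::uint8_t  type;
    std::uint32_t length;
    std::uint16_t checksum;
};
#pragma pack(pop)      // restore the previous packing setting

// Same members with the compiler's default alignment, for comparison.
struct PaddedHeader {
    std::uint8_t  type;
    std::uint32_t length;
    std::uint16_t checksum;
};

static_assert(sizeof(PackedHeader) == 7, "packed layout should contain no padding");
static_assert(offsetof(PackedHeader, length) == 1, "length directly follows type");
```

With default alignment, `PaddedHeader` is typically larger (often 12 bytes) because the compiler pads `length` up to a 4-byte boundary; the exact size is implementation-specific.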

Keep in mind, however, that on some architectures it's not allowed to access fields of particular types if they are not aligned on their natural boundaries, so in these cases you would probably need to do some manual memcpys to copy the raw bytes to variables that are properly aligned.
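A minimal sketch of such a manual copy is shown below; `load_u32_unaligned` is a hypothetical helper name. It copies the bytes into a properly aligned local variable rather than dereferencing a cast pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a uint32_t that starts at an arbitrary (possibly
// unaligned) offset in a byte buffer. No byte-order conversion is done here;
// the value comes back in whatever order the bytes were stored in.
inline std::uint32_t load_u32_unaligned(const unsigned char* buf, std::size_t offset) {
    assert(buf != nullptr);
    std::uint32_t value;                              // properly aligned local
    std::memcpy(&value, buf + offset, sizeof value);  // byte-wise copy, no aligned load needed
    return value;
    // By contrast, *reinterpret_cast<const std::uint32_t*>(buf + offset) may
    // fault on strict-alignment CPUs and is undefined behaviour in C++.
}
```

Compilers generally turn this `memcpy` into a single load on architectures that allow unaligned access, so the safe version usually costs nothing there.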

Matteo Italia
  • Can you elaborate on the `#pragma`s? I use GCC. – Linuxios Mar 02 '12 at 00:32
  • @Linux_iOS.rb.cpp.c.lisp.m.sh: added some links about `#pragma pack`. – Matteo Italia Mar 02 '12 at 00:53
  • hton*/ntoh* only convert between host byte order (which may be big or little endian) and network byte order (effectively big endian). They do not provide a portable way to serialize/deserialize values that are stored _little endian_. – fuzzyTew Nov 27 '16 at 17:19
  • I wrote a small header only library for this: [Php pack/unpack C++](https://stackoverflow.com/a/62194081/2279422) in case anyone is looking for same functionality with a similar API – Waqar Jul 12 '20 at 19:55

Why not boost serialization or protocol buffers?

Jon

Yes: use std::copy from <algorithm> to operate on the byte representation of a variable. Every variable T x; can be accessed as a byte array via char * p = reinterpret_cast<char*>(&x); and p can then be treated like a pointer to the first element of an array char[sizeof(T)]. For example:

char buf[100];
double q = get_value();

char const * const p = reinterpret_cast<char const *>(&q);
std::copy(p, p + sizeof(double), buf);

// more stuff like that

some_stream.write(buf, sizeof(double)); // ... etc.

And to go back (here data points to the received bytes):

double r;

std::copy(data, data + sizeof(double), reinterpret_cast<char *>(&r));

In short, you don't need a dedicated pack/unpack in C++, because the language already allows you access to its variables' binary representation as a standard part of the language.

Kerrek SB
  • I know. I want a pack/unpack for portable representation, not local byte manipulation. – Linuxios Mar 02 '12 at 00:13
  • You're guaranteed to get a valid object back only for types that are trivially copyable. I'm sure you already know this, but you didn't say it. – bames53 Mar 02 '12 at 00:55
  • This is not portable at all. You are not considering endianness for starters, and of course it only works for POD (plain old data) types. – Correa Mar 02 '12 at 04:37
  • @Linux_iOS.rb.cpp.c.lisp.m.sh: Well, what's a "portable" way of storing floats? What I'm saying is that you can use the byte-access to *implement* any sort of serialization that you like. For example, you can read/write unsigned integers with the usual algebraic manipulations (`buf[0] = n % 256; buf[1] = (n / 256) % 256;`, etc.), though you need unsigned chars in that case. And yes, this is only for fundamental types (not even PODs). – Kerrek SB Mar 02 '12 at 06:10
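
The arithmetic approach from the last comment might be sketched as follows. `pack_u32_le` and `unpack_u32_le` are hypothetical helper names, and the little-endian wire order is an assumption chosen for illustration (it is the case the hton* functions do not cover). Shifts and masks are used in place of the equivalent division and modulo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode a 32-bit unsigned integer as four little-endian
// bytes using only arithmetic, so the result does not depend on the host's
// byte order.
inline void pack_u32_le(unsigned char* buf, std::uint32_t n) {
    assert(buf != nullptr);
    for (int i = 0; i < 4; ++i) {
        // Equivalent to (n / 256^i) % 256.
        buf[i] = static_cast<unsigned char>((n >> (8 * i)) & 0xFFu);
    }
}

// Hypothetical helper: decode four little-endian bytes back into an integer.
inline std::uint32_t unpack_u32_le(const unsigned char* buf) {
    assert(buf != nullptr);
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i) {
        n |= static_cast<std::uint32_t>(buf[i]) << (8 * i);
    }
    return n;
}
```

Since nothing here inspects the in-memory layout of `n`, the same bytes come out on any host, and no packing pragmas or alignment workarounds are involved.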