3

So I have a construct called packet

struct Packet {
    unsigned int packet_type;
    wchar_t packet_length[128];
    wchar_t file_name[256];
    wchar_t template_name[256];
    wchar_t file_name_list[1024];
    wchar_t file_data[1024];

    void serialize(char * dat) {
        memcpy(dat, this, sizeof(Packet));
    }

    void deserialize(const char * dat) {
        memcpy(this, dat, sizeof(Packet));
    }
};

I'm trying to desieralize from this data

{byte[2692]}
[0]    0       unsigned int packet_type; (4 bytes)
[1]    0
[2]    0
[3]    0
[4]    50 '2'  wchar_t packet_length[128]; (128 bytes)
[3]    0
[5]    54 '6'
[3]    0
[6]    57 '9'
[3]    0
[7]    50 '2'
[8]    0
[...]  0
[132]  112 'p'  wchar_t file_name[256]; (256 bytes)
[133]  0
[134]  104 'h'
[...]  0

But the memcpy in deserialze isn't giving me the file_name, but it does give me the packet_length. What's up with this? Thanks!

EDIT: So it's clear to me now that wchar_t is taking up more space than I once thought; however, I'm being told not to use memcpy at all?

I've written this deserialize method and it grabs the data correctly. Will this still cause a security leak?

void deserialize(const char * dat) {
        memcpy(&(packet_type), dat, 4);
        memcpy(&(packet_length[0]), dat + 4, 128);
        memcpy(&(file_name[0]), dat + 132, 256);
        memcpy(&(template_name[0]), dat + 388, 256);
        memcpy(&(file_name_list[0]), dat + 644, 1024);
        memcpy(&(file_data[0]), dat + 1668, 1024);
    }
jacobsgriffith
  • 1,448
  • 13
  • 18
  • 1
    Double check the size of `wchar_t` on your target system. I'm betting it's not 1 byte so you're size expectations are wrong. `wchar_t file_name[256]` should be at least 512 bytes not 256. Give then last section of your question I'm betting you're looking at the wrong offset into the array. – Captain Obvlious Apr 23 '14 at 18:29
  • I would imagine that this is largely undefined behavior. Even if it isn't, you are also mixing `char` with `wchar_t`. Since it looks like the size of `wchar_t` on your system is two bytes rather than four, I'm guessing this is being run on a Windows machine. Check out [wmemcpy](http://msdn.microsoft.com/en-us/library/dswaw1wk.aspx) and [wmemcpy_s](http://msdn.microsoft.com/en-us/library/wes2t00f.aspx). – jliv902 Apr 23 '14 at 18:36
  • Please don't use magic numbers for type sizes (e.g. 4 for the size of an unsigned int). The safest thing is to use sizeof( varname ). That accounts not just for the size of the type of varname, but handles instances of varname's type being changed. – Rob K Apr 23 '14 at 19:04

2 Answers2

5

Please do not use this method for serializing structures. It's utterly non-portable.

Moreover, the compiler might as well pad, align or reorder members depending on the target architecture, endianness, optimizations and a bunch of other things.

A much more elegant way would be to use boost::Serialization, which takes care of low-level details in a portable way.

If, on the other hand, you just want to inspect your structures, then a debugger comes handy...

Roberto Reale
  • 4,247
  • 1
  • 17
  • 21
2

The layout of your char array assumes that the size of wchar_t is two bytes; it is not - here is an example of a system where the size of wchar_t is 4, so the size of Packet is 10756, not 2692 bytes: (link to a demo).

That is why your memcpy trick from the edit presents a problem: it assumes that the layout of data in the char[] array matches the layout of wchar_t[] arrays, which it may or may not match. If you know that your data array has two-character elements stored in little endian format (LSB first), you can write your own function that converts the data from the source to the destination, and call it for portions of your serialized data, like this:

void bytes_to_wchar(wchar_t *dest, const unsigned char* src, size_t length) {
    for (size_t i = 0 ; i != lengt ; i++) {
        dest[i] = src[2*i] | (src[2*i+1] << 8);
    }
}

Now you can use this function to copy data into wchar_t arrays independently of the wchar_t size on the target system, or the endianness of the target system:

void deserialize(const char * dat) {
    bytes_to_wchar(packet_type,       dat + 0,      4);
    bytes_to_wchar(packet_length[0],  dat + 4,    128);
    bytes_to_wchar(file_name[0],      dat + 132,  256);
    bytes_to_wchar(template_name[0],  dat + 388,  256);
    bytes_to_wchar(file_name_list[0], dat + 644,  1024);
    bytes_to_wchar(file_data[0],      dat + 1668, 1024);
}

The shortcut of saving the data from memory and writing it back may work when you do it on the same hardware, using the same compiler. Even then it remains sensitive to small adjustments in the headers that you use and in the settings of the compiler.

If the character array that you need to copy into the struct has a fixed layout, you need to write a function to process that layout, converting two-byte groups into wchar_ts, four-byte groups into unsigned ints, and so on.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523