
Problem:

I wrote an object into a file in binary mode using std::fstream. However, when I read it back from that file to another object and then call one of the virtual member functions of this new object, there's a memory access violation error.

Here's what the code looks like:

class A
{
public:
   // data
   virtual void foo() = 0;
}; 

class B: public A
{
public:
   // added data 
   virtual void foo() { ... }
};

int main()
{
  // ...
  A* a = new B();
  A* b = new B();

  file.write((char*)a, sizeof(B));
  // ...
  thatSameFile.read((char*)b, sizeof(B));

  b->foo(); // error here

}

What I've figured out:

After a couple of hours debugging, I see that the __vfptr member of b (which, as far as I know, is the pointer to the virtual table of that object) is changing after the file read statement. I guess that not only did I write the data of a to file and copy those to b, I also copied the virtual table pointer of a to b.

Is what I said correct? How can I fix this problem?

curiousguy
Stoatman
  • You cannot serialize blindly as you do. – Jarod42 Jul 03 '17 at 19:42
  • Related: https://stackoverflow.com/a/4550549/214671 . Short answer: don't do this. You can serialize/deserialize an object brutally like that only if it's a POD type which doesn't contain pointers - and this is definitely not POD. You may get away with this if you separate the POD part of your object into a separate structure, and serialize/deserialize only that. – Matteo Italia Jul 03 '17 at 19:43
  • When you want to save a type with virtual functions, DON'T! Never save any pointers to files. You should use enums to distinguish between the different subclasses... and load the other values separately. – cmdLP Jul 03 '17 at 19:46
  • Actually, I just want to use a couple of POD members in that object (not pointers). I am aware that saving a pointer to a file and reading it back is silly. However, what I am trying to understand now is why the __vfptr changed? And could I keep doing this if I won't touch the pointers? – Stoatman Jul 03 '17 at 19:53
  • See the [requirements for a POD type](http://en.cppreference.com/w/cpp/concept/PODType). If your object does not meet these requirements, you cannot simply set its representation in memory directly, as that is undefined behavior. Once you're in undefined behavior, you cannot rationalize anything. If your members are POD, then feel free to deserialize only those members (with direct pointers to them). However, I feel there is another problem that is being overlooked, which is that you are deserializing to `&a`, which is to say you are writing to the pointer `a`, not the object pointed to by `a`. – François Andrieux Jul 03 '17 at 19:57
  • @FrançoisAndrieux: Your first sentence cleared my mind. For the latter problem that you pointed out, it's just my typing mistake. I'm sorry. – Stoatman Jul 03 '17 at 20:02

1 Answer


Is what I said correct?

No, it's not correct. The source of the problem is that you are merely writing addresses to a file and loading them back (additionally, with wrong sizes used).

file.write((char*)&a, sizeof(B));

The preceding line writes sizeof(B) bytes starting at the address of the pointer variable a. That is the pointer value itself plus whatever happens to follow it in memory, not the B object it points to.

Pointers cannot be reconstructed from a file, since they need to be memory-managed (dynamic allocation, in your case).

So the statement

thatSameFile.read((char*)&b, sizeof(B));

just overwrites the pointer stored in b with some arbitrary, meaningless value, plus some additional bytes on the stack. This is basically undefined behavior.

As for your comment that this was a typo: that doesn't change much about what I wrote above. Pointers cannot be reconstructed from files.


How can I fix this problem?

If you need to write binary images of your structs/classes, you can do so for plain POD types like

struct Foo {
    char c;
    int i;
    double d;
    long arrlong[25];
};

that contain only arithmetic types (integers and floating-point values) or fixed-size arrays of them.

Such types could be written "safely" to and restored from a binary file for the same target architecture (see Endianness):

Foo a;

file.write((const char*)&a, sizeof(Foo));

// ...

Foo b;
thatSameFile.read((char*)&b, sizeof(Foo));
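A fuller sketch of that round trip is shown here, assuming a C++11 compiler. The function name roundTrip and the file path passed to it are hypothetical; the static_assert is an added guard, not part of the snippet above. Note that any padding bytes inside Foo are written too, so the file is only meaningful to a program built for the same compiler and architecture.

```cpp
#include <cassert>
#include <fstream>
#include <type_traits>

struct Foo {
    char c;
    int i;
    double d;
    long arrlong[25];
};

// Copying raw bytes is only defined for trivially copyable types,
// so refuse to compile if Foo ever stops being one.
static_assert(std::is_trivially_copyable<Foo>::value,
              "Foo must be trivially copyable to be dumped as raw bytes");

// Hypothetical helper: writes `in` to `path`, reads it back into a fresh
// object, and reports whether both I/O steps succeeded and the data matches.
bool roundTrip(const char* path, const Foo& in) {
    assert(path != nullptr);

    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(&in), sizeof(Foo));
    out.close();                      // flush before reading the file back
    if (!out) return false;

    Foo back{};                       // the object itself, not a pointer to one
    std::ifstream is(path, std::ios::binary);
    is.read(reinterpret_cast<char*>(&back), sizeof(Foo));
    if (!is) return false;

    return back.c == in.c && back.i == in.i && back.d == in.d &&
           back.arrlong[24] == in.arrlong[24];
}
```

Checking the stream state after write and read matters here: a short read would otherwise leave the object partially filled without any visible error.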

Also, you cannot do this with polymorphic types (types that have virtual functions). Just restoring a vtable pointer (a mechanism the C++ standard doesn't even specify) isn't enough to safely tell the runtime what the underlying type actually is.
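One way to have the compiler catch this mistake is a type-trait check, sketched below under the assumption of C++11 or later. The names Pod, Base, Derived and canDumpRaw are hypothetical. The trait std::is_trivially_copyable is false for any class with virtual functions, but it does not detect pointer members, so it is a necessary check rather than a sufficient one.

```cpp
#include <type_traits>

// Plain data: safe to copy as raw bytes.
struct Pod {
    int i;
    double d;
};

// Polymorphic hierarchy, analogous to A and B in the question.
struct Base {
    virtual ~Base() = default;
    virtual void foo() = 0;
};

struct Derived : Base {
    int payload = 0;
    void foo() override {}
};

// Hypothetical guard: true only for types whose bytes may be copied raw.
template <typename T>
constexpr bool canDumpRaw() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(canDumpRaw<Pod>(), "plain data can be dumped as bytes");
static_assert(!canDumpRaw<Derived>(), "polymorphic types cannot");
```

Placing such a static_assert next to every raw write or read turns the runtime access violation from the question into a compile error.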


You should look up serialization/deserialization to achieve what you want. There are several libraries that support binary formats well, like Boost.Serialization or Google Protocol Buffers, which help you build something more sophisticated than POD serialization/deserialization.
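To illustrate the general idea without either library, here is a minimal hand-rolled sketch, assuming C++11. It is not the Boost.Serialization or Protocol Buffers API. All names in it (Kind, save, load, writeObject, readObject, the value member, and foo returning int) are hypothetical stand-ins. Each object writes a type tag followed by its data members only; on reading, a real object is constructed first, so the vtable pointer is set up by the constructor and never comes from the file.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>

// Type tag stored in the stream instead of any pointer.
enum class Kind : std::uint8_t { B = 1 };

class A {
public:
    virtual ~A() = default;
    virtual Kind kind() const = 0;
    virtual void save(std::ostream& os) const = 0;  // writes data members only
    virtual void load(std::istream& is) = 0;        // reads data members only
    virtual int foo() const = 0;
};

class B : public A {
public:
    std::int32_t value = 0;  // fixed-width type keeps the on-disk size stable

    Kind kind() const override { return Kind::B; }
    void save(std::ostream& os) const override {
        os.write(reinterpret_cast<const char*>(&value), sizeof value);
    }
    void load(std::istream& is) override {
        is.read(reinterpret_cast<char*>(&value), sizeof value);
    }
    int foo() const override { return value; }
};

// Writes the tag first, then lets the object write its own fields.
void writeObject(std::ostream& os, const A& obj) {
    assert(os.good());  // precondition of this sketch: a usable stream
    os.put(static_cast<char>(static_cast<std::uint8_t>(obj.kind())));
    obj.save(os);
}

// Reads the tag, constructs the matching type, then loads its fields.
// Returns nullptr for an unknown tag or a failed read.
std::unique_ptr<A> readObject(std::istream& is) {
    const int tag = is.get();
    std::unique_ptr<A> obj;
    if (tag == static_cast<int>(Kind::B)) obj.reset(new B());
    if (!obj) return nullptr;
    obj->load(is);
    if (!is) return nullptr;
    return obj;
}
```

Used with binary file streams, writeObject(file, *a) followed later by readObject(thatSameFile) yields a fresh object on which virtual calls work. The byte order of value is still that of the writing machine, which is one of the details the libraries mentioned above handle for you.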

user0042