3

If you have an object instance A that references other objects (for example instances B and C), and you binary serialize A to a file, what happens? Do you now have serialized data that includes A, B and C?

How does it work exactly? What will I get if I deserialize the data? A, B, and C??

(Feel free to include internal workings explanations as well).

richard

3 Answers

8

All of the references to other objects will be serialized as well. If you deserialize the data, you will end up with a complete, working set of its data, including objects A, B, and C. That's probably the primary benefit of binary serialization, as opposed to XML serialization.
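As a rough illustration of the round trip described above, here is a minimal sketch. It assumes the .NET Framework, where `BinaryFormatter` is available (it is obsolete in later .NET versions and has since been removed there). The classes `A`, `B`, and `C` and their fields are hypothetical, named after the question, and a `MemoryStream` stands in for the file; a `FileStream` would be used the same way.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical types mirroring the question: A holds references to B and C.
[Serializable]
class B { public string Name = "b"; }

[Serializable]
class C { public int Value = 42; }

[Serializable]
class A
{
    public B First = new B();
    public C Second = new C();
}

static class Program
{
    static void Main()
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            // Serializing A also writes the B and C instances it references.
            formatter.Serialize(stream, new A());

            // Rewind and read everything back as new objects.
            stream.Position = 0;
            var copy = (A)formatter.Deserialize(stream);

            Console.WriteLine(copy.First.Name);   // b
            Console.WriteLine(copy.Second.Value); // 42
        }
    }
}
```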

If any of the other classes your object holds a reference to are not marked with the [Serializable] attribute, you'll get a SerializationException at run-time (the image of which was shamelessly stolen from the web; run-time errors don't even look like this anymore in the current versions of VS):

    Example of an unhandled SerializationException
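For readers without the screenshot, the failure can be reproduced with a sketch like the following. It assumes the .NET Framework's `BinaryFormatter`; the `Holder` and `NotMarked` types are made up for illustration. The exact message text varies with the type and assembly, so it is printed rather than quoted here.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical type that is deliberately NOT marked [Serializable].
class NotMarked { }

// The outer type is serializable, but it references one that is not.
[Serializable]
class Holder { public NotMarked Child = new NotMarked(); }

static class Program
{
    static void Main()
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            try
            {
                formatter.Serialize(stream, new Holder());
            }
            catch (SerializationException ex)
            {
                // Thrown at run time, when the formatter reaches the unmarked type.
                Console.WriteLine(ex.Message);
            }
        }
    }
}
```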

Beyond that, I'm not really sure what "internal things" you were hoping to understand. Serialization uses reflection to walk through the public and private fields of objects, converting them to a stream of bytes, which are ultimately written out to a data stream. During deserialization, the inverse happens: a stream of bytes, along with type information, is read in from the data stream and used to synthesize an exact replica of the object. All of the fields in the object have the same values that they held before; the constructor is not called when an object is deserialized. The easiest way to think about it is that you're simply taking a snapshot-in-place of the object, that you can restore to its original state at will.
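The two points above (private fields are captured, and the constructor is skipped on deserialization) can be checked with a small sketch. It again assumes the .NET Framework's `BinaryFormatter`; the `Counter` class and its static call counter are hypothetical and exist only to make the behaviour visible.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Counter
{
    // Static fields are not part of an instance, so this is never serialized.
    public static int ConstructorCalls;

    private int secret; // private, but still captured by the formatter

    public Counter(int secret)
    {
        ConstructorCalls++;
        this.secret = secret;
    }

    public int Secret { get { return secret; } }
}

static class Program
{
    static void Main()
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, new Counter(7)); // constructor runs once here
            stream.Position = 0;
            var copy = (Counter)formatter.Deserialize(stream);

            Console.WriteLine(copy.Secret);             // 7
            Console.WriteLine(Counter.ConstructorCalls); // 1
        }
    }
}
```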

The class that is responsible for the actual serialization and deserialization is called a formatter (it always implements the IFormatter interface). Its job is to generate an "object graph", which is a generalized tree containing the object that is being serialized/deserialized as its root. As mentioned above, the formatter uses reflection to walk through this object graph, serializing/deserializing all object references contained by that object. The formatter is also intelligent enough to know not to serialize any object in the graph more than once. If two object references actually point to the same object, this will be detected and that object will only be serialized once. This and other logic prevents entering an infinite loop.
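To make the "serialized only once" behaviour concrete, here is a sketch that round-trips a graph containing both a shared reference and a cycle. It assumes the .NET Framework's `BinaryFormatter`; `Node` and `Pair` are hypothetical types invented for the example.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
class Node
{
    public string Name;
    public Node Next;
}

[Serializable]
class Pair
{
    public Node Left;
    public Node Right;
}

static class Program
{
    static void Main()
    {
        var shared = new Node { Name = "shared" };
        shared.Next = shared; // a cycle: the node points at itself

        // Both fields refer to the very same Node instance.
        var pair = new Pair { Left = shared, Right = shared };

        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, pair); // completes despite the cycle
            stream.Position = 0;
            var copy = (Pair)formatter.Deserialize(stream);

            // The shared reference is still shared in the copy.
            Console.WriteLine(ReferenceEquals(copy.Left, copy.Right));     // True
            // The cycle is rebuilt as well.
            Console.WriteLine(ReferenceEquals(copy.Left.Next, copy.Left)); // True
            // The copy is a new object, not the original instance.
            Console.WriteLine(ReferenceEquals(copy.Left, shared));         // False
        }
    }
}
```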

Of course, it's easy to have a good general understanding of how this process works. It's much harder to actually write the code that implements it yourself. Fortunately, that's already been done for you. Part of the point of the .NET Framework is that all this complicated serialization logic is built in, leaving you free from worrying about it. I don't claim to understand all of it myself, and you certainly don't need to either to take full advantage of the functionality it offers. Years of writing all that code by hand are finally over. You should be rejoicing, rather than worrying about implementation details. :-)

Cody Gray - on strike
  • Thanks Cody, exactly what I was looking for. The reason I ask for the internal workings of things is that I don't really understand and get the "ah ha" unless I do. I have to visualize everything in order to get it, and remember it. Thanks again for the great answer. – richard Feb 05 '11 at 09:02
  • @Richard: Of course. I understand how that goes, because I'm much the same way myself. I wasn't so much trying to get across that you shouldn't try to understand it. Rather that "it gets incredibly complicated from here", and that little bit of extra knowledge isn't worth understanding to have an appreciation of how it works. Anyway, you're welcome. – Cody Gray - on strike Feb 05 '11 at 09:05
4

Firstly, object A's type must be tagged with the [Serializable] attribute. Serializing A will serialize all its member data, private or public, provided the members' types are also tagged with [Serializable] (or to use your example, provided that B and C's types are marked [Serializable]). Attempts to serialize data, directly or indirectly, of a type that is not [Serializable] will result in an exception.

A number of the built-in .NET types are already marked as [Serializable], including System.Int32 (int), System.Boolean (bool), etc.

You can read more about .NET serialization here: http://msdn.microsoft.com/en-us/library/4abbf6k0.aspx.

DuckMaestro
1

The objects referred to by the main object have to be [Serializable] as well. Provided they are, all is done automatically by the formatter.

Felice Pollano
  • ..."all is done".. My question is about that part, what is the "all" that is done? LOL – richard Feb 05 '11 at 08:06
  • 2
    @Richard: He means that all of the references to other objects will be serialized as well. If you deserialize the data, you will end up with a complete, working set of data, including A, B, and C. That's probably the primary benefit of *binary* serialization. If those other classes are not marked `[Serializable]`, you'll get an exception. – Cody Gray - on strike Feb 05 '11 at 08:09
  • @Cody: The answer plus your comment gave me the answer. Thanks! – richard Feb 05 '11 at 08:13
  • Although to be honest, if someone gave me some of the internal things that were happening, that would be great. – richard Feb 05 '11 at 08:14
  • @Richard: I started trying to answer that with another comment, but quickly realized it should be an answer instead. So I've combined my previous comment with some new information on the "internals", and posted an answer. Hopefully that clarifies this for you. – Cody Gray - on strike Feb 05 '11 at 08:54
  • @Richard sorry for my too short response, anyway I think you can have a complete answer by using the POM – Felice Pollano Feb 05 '11 at 10:34