Detecting that a Serializable class has renamed fields/properties

Question

When you have a [Serializable] class, some changes (e.g. renaming properties or fields) can mean you lose data when deserialising in the new version.

I'm searching for an automated way to detect these changes so that they can be inspected manually. Ideally, something that can be given two .NET assemblies, find the Serializable classes in them and detect potentially breaking changes.

If it can ignore things that are generally not a problem, e.g. adding fields/properties, that's a bonus, but not essential.

An example of the kind of refactoring that is fine for non-serializable classes, but causes major issues for serializable ones:

// version 1

[Serializable]
class A 
{
    private string _name;
    public string Name 
    { 
       get { return _name;} 
       set {_name = value;}
    }
}

// version 2

[Serializable]
class A 
{
    public string Name { get; set;}
}

I'd love a tool here that said something along the lines of

Class A - missing field _name
Class A - new property Name

Things I've considered so far:

Find all classes with the [Serializable] attribute, create an instance and serialize it.
- Aside from issues where there isn't a default constructor, this feels like it would struggle when fields/properties are null when first created.
Use reflection to find fields and their types within each [Serializable] class, and produce an alphabetical list.
- Won't handle custom serialization, and it feels like the wrong approach
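
As a rough sketch of the reflection idea, assuming a field-based formatter that writes every instance field not marked [NonSerialized]: the two versions of class A are placed in namespaces V1 and V2 purely so they can sit in one file, and FieldLister is a hypothetical helper, not an existing tool. It ignores ISerializable and base-class fields, and newer .NET versions may flag IsNotSerialized as obsolete.

```csharp
using System;
using System.Linq;
using System.Reflection;

namespace V1
{
    [Serializable]
    class A
    {
        private string _name;
        public string Name { get { return _name; } set { _name = value; } }
    }
}

namespace V2
{
    [Serializable]
    class A
    {
        public string Name { get; set; }
    }
}

public static class FieldLister
{
    // Names of the instance fields declared on the type that a
    // field-based formatter would write (hypothetical helper).
    public static string[] SerializedFieldNames(Type type)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
        return type.GetFields(flags)
                   .Where(f => !f.IsNotSerialized)          // skip [NonSerialized] fields
                   .Select(f => f.Name)
                   .OrderBy(n => n, StringComparer.Ordinal) // stable, culture-independent order
                   .ToArray();
    }
}
```

For V1.A this lists `_name`; for V2.A it lists the compiler-generated backing field `<Name>k__BackingField`, which is why the two versions look different to a field-based formatter even though the public property is unchanged.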

(In case the serializer itself matters: we currently use our own JSON-based serializer that inherits from System.Runtime.Serialization.Formatter, so it behaves the same way BinaryFormatter does.)

(edit) "Check whether binary serialized data matches the class which serialized it" is a similar question. It seems there isn't a simple answer yet.

Why would the serializer know anything about `_name`? Or believe the two `Name` properties are different? What you are asking about is a code-analyzer, not something with serialization. To a serializer, your two classes are the same. — Ron Beyer, Apr 06 '18 at 14:42
@RonBeyer hmm.. although if `Formatter` is the base-class of `BinaryFormatter`, it could *still* be field-based! — Marc Gravell, Apr 06 '18 at 14:45
@MarcGravell I'd agree, a detail we can't know without seeing the implementation of the "our own json-based serializer" although I would be surprised if it derived from `BinaryFormatter`. It can't be "serializer independent" though, some serialize class structure along with data while some, like most JSON-based serializers, just serialize data in a structure-agnostic format. — Ron Beyer, Apr 06 '18 at 14:46
@Stuart it feels like you're trying to solve the wrong problem, if I'm being honest. There's a reason that most serializers *don't work at the field level*. Pretty much every off-the-shelf json serializer would *get this right* without you having to do anything; so would many binary serializers (pb-net, etc). To be honest, a better use of your time might be simply : moving away from this field-based custom formatter. — Marc Gravell, Apr 06 '18 at 14:48
@MarcGravell but essence of this problem remains even with better serializer. OP might have provided better example, such as Name being renamed to Title or something like that. — Evk, Apr 06 '18 at 15:16
@Evk and *in that scenario*: most serializers have a mechanism to handle it, usually by adding attributes. But at that point: if you're changing the public API of exchange types / DTOs - you already **know** that you should expect pain; it is inherently a breaking change. The change to *fields* is much more subtle. — Marc Gravell, Apr 06 '18 at 15:21
Moving away from the custom serializer is something I really want to do long term, but I want something short term that detects these problems. As I said, our formatter is a subclass of the C# Formatter library, so I assume that all of these problems will apply to BinaryFormatter and any other formatter that derives from it - in that it uses the private fields rather than the public interface. — Stuart Moore, Apr 06 '18 at 15:29
Re different renames - what I really want is a way to detect these breaking changes. Adding fields, renames, whatever - so they can be reviewed specially. (Yes, as soon as I see it I know it's going to cause pain, but sadly not every developer realises that and it can get missed in code review.) — Stuart Moore, Apr 06 '18 at 15:39

Denis Kucherov · Answer 1 · 2018-04-06T15:29:06.943

0

What if someone removes the line that checks a value for 0 before dividing? We catch that with a unit test.

So why not use the same method here? You can test for it directly, or create a tool that collects reference data. The tool writes an XML output file describing all Serializable classes and their fields, perhaps with a timestamp and so on. Something like this:

<ObjectName Timestamp="{CreatedDate}">
  <Field Type="int" Name="Id" />
  <Field Type="string" Name="Address" />
</ObjectName>
...

The unit tests then use this file to verify the assemblies through reflection. With this information we can detect changed, removed and new fields.
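
A minimal sketch of such a tool, under some assumptions: SerializationBaseline, Capture and Compare are hypothetical names, only fields declared directly on each class are recorded, and the element is called Type with a Name attribute (rather than using the class name as the element name) because nested and generic type names are not valid XML element names. The timestamp is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Xml.Linq;

public static class SerializationBaseline
{
    const BindingFlags InstanceFields = BindingFlags.Instance | BindingFlags.Public |
                                        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // One <Type> element per [Serializable] class, one <Field> per serialized field.
    public static XElement Capture(Assembly assembly)
    {
        return new XElement("Baseline",
            from t in assembly.GetTypes()
            where t.IsClass && t.IsSerializable
                  && !typeof(Delegate).IsAssignableFrom(t)                   // delegates report as serializable
                  && !t.IsDefined(typeof(CompilerGeneratedAttribute), false) // skip lambda helper classes
            orderby t.FullName
            select new XElement("Type", new XAttribute("Name", t.FullName),
                from f in t.GetFields(InstanceFields)
                where !f.IsNotSerialized
                orderby f.Name
                select new XElement("Field",
                    new XAttribute("Type", f.FieldType.ToString()),
                    new XAttribute("Name", f.Name))));
    }

    // Map each type name to its set of "FieldType FieldName" entries.
    static Dictionary<string, HashSet<string>> Index(XElement root)
    {
        return root.Elements("Type").ToDictionary(
            t => (string)t.Attribute("Name"),
            t => new HashSet<string>(t.Elements("Field").Select(
                f => (string)f.Attribute("Type") + " " + (string)f.Attribute("Name"))));
    }

    // Report every difference between a stored baseline and the current snapshot.
    public static List<string> Compare(XElement baseline, XElement current)
    {
        var before = Index(baseline);
        var after = Index(current);
        var report = new List<string>();
        foreach (var pair in before.OrderBy(p => p.Key, StringComparer.Ordinal))
        {
            if (!after.TryGetValue(pair.Key, out var fields))
            {
                report.Add($"Class {pair.Key} - removed or no longer serializable");
                continue;
            }
            foreach (var f in pair.Value.Except(fields).OrderBy(s => s, StringComparer.Ordinal))
                report.Add($"Class {pair.Key} - missing field {f}");
            foreach (var f in fields.Except(pair.Value).OrderBy(s => s, StringComparer.Ordinal))
                report.Add($"Class {pair.Key} - new field {f}");
        }
        return report;
    }
}
```

A unit test could load the checked-in baseline with XElement.Load, call Capture on the current assembly, and fail if Compare returns any lines, so that a reviewer has to look at the change and update the baseline deliberately.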

edited Apr 06 '18 at 15:29

answered Apr 06 '18 at 15:23

Thanks, I think my issue is that I don't have an easy way to create an example "serializable object" of each type. I was hoping that this was a solved problem and that someone else had written a tool, rather than writing my own. – Stuart Moore Apr 06 '18 at 15:31