11

I have a very large C# object, but I only need to return a handful of its properties (which can be on nested objects), allow client-side JavaScript to modify those properties, and then send the resulting object back to the server for in-place partial deserialization.

The idea is to re-use some very large existing business objects, but be intelligent about serializing and sending only those properties to the client application for modification (to keep the amount of data transferred to a minimum).

I basically have an XML file where I pre-define all of the bindings using a "path syntax" which would indicate only those properties I need to serialize. So, I could use something like "WorkOrder.UserField1" or "WorkOrder.Client.Name".

I have tried using a custom contract resolver to determine whether or not a property should be serialized; however, it doesn't seem that I have information as to the "path" (in other words, other properties in the object model up the chain) in order to determine if the property should or should not be serialized.

I have also tried using a custom JsonTextWriter, but it doesn't seem that I can override the methods necessary to keep track of the path, even though there is a Path property available. Is there something perhaps simple that I am overlooking in order to be able to view the path hierarchy of a property being serialized and determine if it should be serialized by looking up the path in a table and making the decision?

JFalcon
  • Are you looking to prune properties from your class hierarchy in a context-independent or context-dependent manner? I.e. if you have a class `Bar`, and a `class Foo { public Bar Bar1 { get; set; } public Bar Bar2 { get; set; } }`, is there a chance you will want different subsets of properties for `Bar1` and `Bar2`? – dbc May 18 '15 at 22:19
  • Would you be willing to serialize the entire object, then filter out the undesired properties? That's easy, and meets your requirement to reduce the amount of data transferred. – dbc May 19 '15 at 04:35
  • 1
    @dbc it is going to be context dependent. Basically, I am pre-defining a form "template" in which I would generate HTML as well as data-bindings to the object properties. So, I will collect all of the unique paths from this template, but it is possible that two paths may have the same object type, but only use a different subset of properties. – JFalcon May 19 '15 at 15:13
  • @dbc, I can deal with serializing the entire object, if there is an easy way to pick off only the values that match the "property" paths I am interested in after serialization. – JFalcon May 19 '15 at 16:13

4 Answers

14

The basic difficulty here is that Json.NET is a contract-based serializer which creates a contract for each type to be serialized, then (de)serializes according to the contract. If a type appears in multiple locations in the object hierarchy, the same contract applies. But you want to selectively include properties for a given type depending on its location in the hierarchy, which conflicts with the basic "one type one contract" design.
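To make the "one type, one contract" point concrete, here is a minimal sketch using the Foo/Bar shape from the comments above. The resolver name `TypeScopedResolver` and the `Name`/`Secret` properties are made up for illustration. It shows that a resolver only sees the declaring type and member name, so a property it hides is hidden everywhere that type appears:

```csharp
using System;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Hypothetical resolver, for illustration only.
public class TypeScopedResolver : DefaultContractResolver
{
    protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
    {
        JsonProperty property = base.CreateProperty(member, memberSerialization);

        // The contract is built per type and then cached, so the only context
        // available here is the declaring type and the member name.
        if (property.DeclaringType == typeof(Bar) && property.PropertyName == "Secret")
        {
            // ShouldSerialize receives the owning instance, not its location in the graph.
            property.ShouldSerialize = instance => false;
        }

        return property;
    }
}

// Hypothetical model classes (Name and Secret are assumptions).
public class Bar { public string Name { get; set; } public string Secret { get; set; } }
public class Foo { public Bar Bar1 { get; set; } public Bar Bar2 { get; set; } }

public static class ContractDemo
{
    public static void Main()
    {
        var foo = new Foo
        {
            Bar1 = new Bar { Name = "n1", Secret = "s1" },
            Bar2 = new Bar { Name = "n2", Secret = "s2" }
        };
        var settings = new JsonSerializerSettings { ContractResolver = new TypeScopedResolver() };

        // Secret is dropped from both Bar1 and Bar2; the resolver cannot tell them apart.
        Console.WriteLine(JsonConvert.SerializeObject(foo, settings));
    }
}
```

There is no hook in this code path that says "this Bar is reached via Bar1", which is why a path-based filter has to be applied somewhere else.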

One quick way to work around this is to serialize to a JObject, then use JToken.SelectTokens() to select only the JSON data you want to return, removing everything else. Since SelectTokens has full support for JSONPath query syntax, you can selectively include using array and property wildcards or other filters, for instance:

"$.FirstLevel[*].Bar"

includes all properties named "Bar" in all array members of a property named "FirstLevel" of the root object.
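As a rough illustration of how that path behaves, the sketch below runs it against some made-up sample JSON (the `Foo`/`Other` members and the values are assumptions, only the path comes from the text above):

```csharp
using System;
using Newtonsoft.Json.Linq;

public static class SelectTokensDemo
{
    public static void Main()
    {
        // Hypothetical sample data shaped to match the path above.
        var root = JObject.Parse(@"{
            ""FirstLevel"": [
                { ""Foo"": ""f1"", ""Bar"": ""b1"" },
                { ""Foo"": ""f2"", ""Bar"": ""b2"" }
            ],
            ""Other"": { ""Bar"": ""ignored"" }
        }");

        // [*] matches every array element; only Bar under FirstLevel is selected.
        foreach (JToken token in root.SelectTokens("$.FirstLevel[*].Bar"))
        {
            Console.WriteLine("{0} = {1}", token.Path, token);
        }
        // prints:
        // FirstLevel[0].Bar = b1
        // FirstLevel[1].Bar = b2
    }
}
```

The tokens returned are live nodes of the original tree, which is what makes it possible to keep them and remove everything else.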

This should reduce your network usage as desired, but won't save any processing time on the server.

Removal can be accomplished with the following extension methods:

public static partial class JsonExtensions
{
    public static TJToken RemoveAllExcept<TJToken>(this TJToken obj, IEnumerable<string> paths) where TJToken : JToken
    {
        if (obj == null || paths == null)
            throw new ArgumentNullException();
        var keepers = new HashSet<JToken>(paths.SelectMany(path => obj.SelectTokens(path)), ObjectReferenceEqualityComparer<JToken>.Default);

        var keepersAndParents = new HashSet<JToken>(keepers.SelectMany(t => t.AncestorsAndSelf()), ObjectReferenceEqualityComparer<JToken>.Default);
        // Keep any token that is a keeper, or a child of a keeper, or a parent of a keeper.
        // I.e. if you have a path "$.A.B" and it turns out that B is an object, then everything
        // under B should be kept.
        foreach (var token in obj.DescendantsAndSelfReversed().Where(t => !keepersAndParents.Contains(t) && !t.AncestorsAndSelf().Any(p => keepers.Contains(p))))
            token.RemoveFromLowestPossibleParent();

        // Return the object itself for fluent style programming.
        return obj;
    }

    public static string SerializeAndSelectTokens<T>(T root, string[] paths, Formatting formatting = Formatting.None, JsonSerializerSettings settings = null)
    {
        var obj = JObject.FromObject(root, JsonSerializer.CreateDefault(settings));

        obj.RemoveAllExcept(paths);

        var json = obj.ToString(formatting);

        return json;
    }

    public static TJToken RemoveFromLowestPossibleParent<TJToken>(this TJToken node) where TJToken : JToken
    {
        if (node == null)
            return null;
        JToken toRemove;
        var property = node.Parent as JProperty;
        if (property != null)
        {
            // Also detach the node from its immediate containing property -- Remove() does not do this even though it seems like it should
            toRemove = property;
            property.Value = null;
        }
        else
        {
            toRemove = node;
        }
        if (toRemove.Parent != null)
            toRemove.Remove();
        return node;
    }

    public static IEnumerable<JToken> DescendantsAndSelfReversed(this JToken node)
    {
        if (node == null)
            throw new ArgumentNullException();
        return RecursiveEnumerableExtensions.Traverse(node, t => ListReversed(t as JContainer));
    }

    // Iterate backwards through a list without throwing an exception if the list is modified.
    static IEnumerable<T> ListReversed<T>(this IList<T> list)
    {
        if (list == null)
            yield break;
        for (int i = list.Count - 1; i >= 0; i--)
            yield return list[i];
    }
}

public static partial class RecursiveEnumerableExtensions
{
    // Rewritten from the answer by Eric Lippert https://stackoverflow.com/users/88656/eric-lippert
    // to "Efficient graph traversal with LINQ - eliminating recursion" http://stackoverflow.com/questions/10253161/efficient-graph-traversal-with-linq-eliminating-recursion
    // to ensure items are returned in the order they are encountered.

    public static IEnumerable<T> Traverse<T>(
        T root,
        Func<T, IEnumerable<T>> children)
    {
        yield return root;

        var stack = new Stack<IEnumerator<T>>();
        try
        {
            stack.Push((children(root) ?? Enumerable.Empty<T>()).GetEnumerator());

            while (stack.Count != 0)
            {
                var enumerator = stack.Peek();
                if (!enumerator.MoveNext())
                {
                    stack.Pop();
                    enumerator.Dispose();
                }
                else
                {
                    yield return enumerator.Current;
                    stack.Push((children(enumerator.Current) ?? Enumerable.Empty<T>()).GetEnumerator());
                }
            }
        }
        finally
        {
            foreach (var enumerator in stack)
                enumerator.Dispose();
        }
    }
}

/// <summary>
/// A generic object comparer that uses only the object's reference,
/// ignoring any <see cref="IEquatable{T}"/> or <see cref="object.Equals(object)"/>  overrides.
/// </summary>
public class ObjectReferenceEqualityComparer<T> : IEqualityComparer<T> where T : class
{
    // Adapted from this answer https://stackoverflow.com/a/1890230
    // to https://stackoverflow.com/questions/1890058/iequalitycomparert-that-uses-referenceequals
    // By https://stackoverflow.com/users/177275/yurik
    private static readonly IEqualityComparer<T> _defaultComparer;

    static ObjectReferenceEqualityComparer() { _defaultComparer = new ObjectReferenceEqualityComparer<T>(); }

    public static IEqualityComparer<T> Default { get { return _defaultComparer; } }

    #region IEqualityComparer<T> Members

    public bool Equals(T x, T y)
    {
        return ReferenceEquals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode(obj);
    }

    #endregion
}

And then use them like:

public class TestClass
{
    public static void Test()
    {
        var root = new RootObject
        {
            FirstLevel1 = new FirstLevel
            {
                SecondLevel1 = new List<SecondLevel> { new SecondLevel { A = "a11", B = "b11", Third1 = new ThirdLevel { Foo = "Foos11", Bar = "Bars11" }, Third2 = new List<ThirdLevel> { new ThirdLevel { Foo = "FooList11", Bar = "BarList11" } } } },
                SecondLevel2 = new List<SecondLevel> { new SecondLevel { A = "a12", B = "b12", Third1 = new ThirdLevel { Foo = "Foos12", Bar = "Bars12" }, Third2 = new List<ThirdLevel> { new ThirdLevel { Foo = "FooList12", Bar = "BarList12" } } } },
            },
            FirstLevel2 = new FirstLevel
            {
                SecondLevel1 = new List<SecondLevel> { new SecondLevel { A = "a21", B = "b21", Third1 = new ThirdLevel { Foo = "Foos21", Bar = "Bars21" }, Third2 = new List<ThirdLevel> { new ThirdLevel { Foo = "FooList21", Bar = "BarList21" } } } },
                SecondLevel2 = new List<SecondLevel> { new SecondLevel { A = "a22", B = "b22", Third1 = new ThirdLevel { Foo = "Foos22", Bar = "Bars22" }, Third2 = new List<ThirdLevel> { new ThirdLevel { Foo = "FooList22", Bar = "BarList22" } } } },
            }
        };

        Assert.IsTrue(JObject.FromObject(root).DescendantsAndSelf().OfType<JValue>().Count() == 24); // No assert

        var paths1 = new string[] 
        {
            "$.FirstLevel2.SecondLevel1[*].A",
            "$.FirstLevel1.SecondLevel2[*].Third2[*].Bar",
        };

        Test(root, paths1, 2);

        var paths3 = new string[] 
        {
            "$.FirstLevel1.SecondLevel2[*].Third2[*].Bar",
        };

        Test(root, paths3, 1);

        var paths4 = new string[] 
        {
            "$.*.SecondLevel2[*].Third2[*].Bar",
        };

        Test(root, paths4, 2);
    }

    static void Test<T>(T root, string [] paths, int expectedCount)
    {
        var json = JsonExtensions.SerializeAndSelectTokens(root, paths, Formatting.Indented);
        Console.WriteLine("Result using paths: {0}", JsonConvert.SerializeObject(paths));
        Console.WriteLine(json);
        Assert.IsTrue(JObject.Parse(json).DescendantsAndSelf().OfType<JValue>().Count() == expectedCount); // No assert
    }
}

public class ThirdLevel
{
    public string Foo { get; set; }
    public string Bar { get; set; }
}

public class SecondLevel
{
    public ThirdLevel Third1 { get; set; }
    public List<ThirdLevel> Third2 { get; set; }

    public string A { get; set; }
    public string B { get; set; }
}

public class FirstLevel
{
    public List<SecondLevel> SecondLevel1 { get; set; }
    public List<SecondLevel> SecondLevel2 { get; set; }
}

public class RootObject
{
    public FirstLevel FirstLevel1 { get; set; }
    public FirstLevel FirstLevel2 { get; set; }
}

Note that there is an enhancement request, "Feature request: ADD JsonProperty.ShouldSerialize(object target, string path)" (#1857), that would make this sort of functionality easier.

Demo fiddles here and here.

dbc
  • 1
    This looks like it may accomplish what I'm trying to do. I didn't even think of the JObject approach and didn't know there was JSON Path wildcard syntax! (Thank you for the enlightenment.) In my case, there isn't a need to use wildcards, all the paths are explicit (i.e. "p1.p2.p3", with each period-separated property being part of the parent object hierarchy and "p3" ultimately being a simple type (like string, int, etc.) instead of an object). Let me wrap my head around this and if it works for what I need, I'll owe you an 'answer' acknowledgement and a beer! – JFalcon May 19 '15 at 20:06
  • This is very well done thanks. Now I have to write tests for it :) – Seth Nov 19 '15 at 22:30
  • Hi, Thanks for the excellent code. The only issue I am facing is if there are duplicate attributes, say, all A's are "a11", and if I give the path for A, it gives only the first occurrence. keepers get the unique values and the keepersandparents have only the first path since it is not checking for where the same value is appearing again. If you can show me how to solve this, it will be of a great help – narasimman Jun 26 '19 at 17:16
1

A much simpler implementation (compared to the accepted answer) is presented here:

public static class JsonExtensions
{
    public static TJToken RemoveAllExcept<TJToken>(this TJToken token, IEnumerable<string> paths) where TJToken : JContainer
    {
        HashSet<JToken> nodesToRemove = new(ReferenceEqualityComparer.Instance);
        HashSet<JToken> nodesToKeep = new(ReferenceEqualityComparer.Instance);

        foreach (var whitelistedToken in paths.SelectMany(token.SelectTokens))
            TraverseTokenPath(whitelistedToken, nodesToRemove, nodesToKeep);

        // None of the paths matched any token, so nothing is whitelisted: remove everything.
        if (nodesToKeep.Count == 0)
        {
            token.RemoveAll();
            return token;
        }

        nodesToRemove.ExceptWith(nodesToKeep);

        foreach (var notWhitelistedNode in nodesToRemove)
            notWhitelistedNode.Remove();

        return token;
    }

    private static void TraverseTokenPath(JToken value, ISet<JToken> nodesToRemove, ISet<JToken> nodesToKeep)
    {
        JToken? immediateValue = value;

        do
        {
            nodesToKeep.Add(immediateValue);

            if (immediateValue.Parent is JObject or JArray)
            {
                foreach (var child in immediateValue.Parent.Children())
                    if (!ReferenceEqualityComparer.Instance.Equals(child, value))
                        nodesToRemove.Add(child);
            }

            immediateValue = immediateValue.Parent;
        } while (immediateValue != null);
    }
}
ademchenko
0

In most cases this can be achieved with a simple single-line extension method:

public static string ToJson<T>(this T self, string path) => $@"{{""{path}"":{JObject.FromObject(self)[path]?.ToString(Formatting.None)}}}";

This is only valid for extracting an object nested directly under the root object, but it is easily adapted with a separate parameter to specify the output path if needed.
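One possible adaptation along those lines, as a sketch rather than a definitive version: the two-parameter overload, the `outputName` parameter and the sample `order` object are all assumptions. It resolves a nested path with `SelectToken` and writes the result under a caller-chosen property name:

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonPathExtensions
{
    // Hypothetical variant: 'path' is resolved with SelectToken (so nested paths work),
    // 'outputName' is the property name used in the emitted JSON.
    public static string ToJson<T>(this T self, string path, string outputName)
    {
        JToken selected = JObject.FromObject(self).SelectToken(path);

        // A missing path is emitted as JSON null instead of producing malformed output.
        var result = new JObject { [outputName] = selected ?? JValue.CreateNull() };
        return result.ToString(Formatting.None);
    }
}

public static class ToJsonDemo
{
    public static void Main()
    {
        // Hypothetical data, loosely modelled on the paths in the question.
        var order = new { Client = new { Name = "Acme", Id = 7 }, Notes = "n" };

        Console.WriteLine(order.ToJson("Client.Name", "ClientName"));
        // prints: {"ClientName":"Acme"}
    }
}
```

Building the result as a JObject instead of interpolating a string also takes care of escaping the property name.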

SteveHayles
0

Thanks to @dbc for a good solution, but as he said, it doesn't save any processing on the server. Sometimes data loaded from a database contains numerous cross-references, and setting ReferenceLoopHandling to Ignore is not enough: the same object gets serialized repeatedly, so the serialized data becomes very large and takes a lot of RAM on the server. In this situation it's better to build a limited JObject directly from the data, rather than building the full JObject and then removing the unwanted paths from it.

This can be done with a small convention on the database entity classes and a custom ContractResolver. Let's assume all database entities inherit from a class or interface such as DbModel (this solution requires it). A special ContractResolver can then limit which objects are serialized. A sample is shown below:

class TypeName
{
    public Type Type { get; set; }
    public string Name { get; set; }
}

class MyContractResolver : DefaultContractResolver
{
    private List<List<TypeName>> allTypeNames = new List<List<TypeName>>();
    public MyContractResolver(Type parentType, string[] includePaths)
    {
        foreach (var includePath in includePaths)
        {
            List<TypeName> typeNames =  new List<TypeName>() { new TypeName() { Type = parentType } };
            var pathChildren = includePath.Split('.');

            for (int i = 0; i < pathChildren.Length; i++)
            {
                var propType = typeNames[i].Type.GetProperties().FirstOrDefault(c => c.Name == pathChildren[i]).PropertyType;
                if (propType.GetInterface(nameof(IEnumerable)) != null && propType != typeof(String))
                {
                    propType = propType.GetGenericArguments().Single();
                }
                typeNames.Add(new TypeName() { Name = pathChildren[i], Type = propType });
            }

            allTypeNames.Add(typeNames);
        }
    }

    protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
    {
        IList<JsonProperty> properties = base.CreateProperties(type, memberSerialization);

        // Only serialize properties that are in the include paths.

        List<JsonProperty> excludeProperties = new List<JsonProperty>();
        foreach (var property in properties)
        {
            if (typeof(DbModel).IsAssignableFrom(property.PropertyType) || (property.PropertyType.GetInterface(nameof(IEnumerable)) != null && property.PropertyType != typeof(String)))
            {
                Console.WriteLine(property.PropertyType.ToString());
                var exclude = true;

                foreach (var typeNames in allTypeNames)
                {
                    var index = typeNames.FindIndex(c => c.Name == property.PropertyName && c.Type == property.PropertyType);
                    if (index > 0)
                    {
                        if (typeNames[index - 1].Type == type) 
                        { 
                            exclude = false;
                            goto EndSearch;
                        }
                    }
                }

            EndSearch:
                if (exclude)
                    excludeProperties.Add(property);
            }
        }
        properties = properties.Where(c => excludeProperties.All(d => d.PropertyName != c.PropertyName)).ToList();

        return properties;
    }
}

This class can be used like this:

// return Ok(data);
var jObject = JObject.FromObject(data,
    JsonSerializer.CreateDefault(new JsonSerializerSettings()
    {
        ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
        Converters = new List<JsonConverter>()
        {
            new ValidationProblemDetailsConverter(),
            new ProblemDetailsConverter(),
            new StringEnumConverter()
        },
        ContractResolver = new MyContractResolver(typeof(Foo), new[] { "bar", "baz.qux" })
    }));
return Ok(jObject);

In this example Foo is the class of the main object to return, and bar and baz are the properties to be serialized (they are loaded from the database too). In addition, qux is a property of baz that is loaded from the database and must be serialized. All other properties of each model that are not database entities (i.e. whose types do not inherit from DbModel) are serialized as usual, while database entities that exist in the original data but are not in the include paths are skipped.

masoud