I am calling a web service that, in some circumstances, returns JSON in which an object contains the same property twice. The output looks similar to this:

{
    "shipments": [
        {
            "id": "A000001",
            "name": "20141208 140652",
            "type": "OUTLET",
            "date": "2014-12-08 14:06:52",
            "status": "SENT",
            "received_at": null,
            "created_at": "2014-12-08 14:06:52",
            "updated_at": null,
            "outlet_id": "SH000064"
        },
        {
            "id": "A000002",
            "name": "20141204 122650",
            "type": "SUPPLIER",
            "date": "2014-12-04 12:26:50",
            "outlet_id": "SH000064",
            "supplier_id": null,
            "status": "RECEIVED",
            "outlet_id": "SH000064",
            "received_at": "2014-12-04 12:28:43",
            "created_at": "2014-12-04 12:26:50",
            "updated_at": "2014-12-04 12:28:43"
        }
    ]
}

I depend on the provider of the service to fix this, and it is not a priority for them, so I have to deal with it myself. To handle it, I convert the JSON to XML using the JsonReaderWriterFactory and then remove the duplicate nodes from the resulting XML with the following routine:

protected virtual void RemoveDuplicateChildren(XmlNode node)
{
    if (node.NodeType != XmlNodeType.Element || !node.HasChildNodes)
    {
        return;
    }

    var xNode = XElement.Load(node.CreateNavigator().ReadSubtree());
    var duplicateNames = new List<string>();

    foreach (XmlNode child in node.ChildNodes)
    {
        var isBottom = this.IsBottomElement(child); // Has no XmlNodeType.Element type children

        if (!isBottom)
        {
            this.RemoveDuplicateChildren(child);
        }
        else
        {
            var count = xNode.Elements(child.Name).Count();

            if (count > 1 && !duplicateNames.Contains(child.Name))
            {
                duplicateNames.Add(child.Name);
            }
        }
    }

    if (duplicateNames.Count > 0)
    {
        foreach (var duplicate in duplicateNames)
        {
            var nodeList = node.SelectNodes(duplicate);

            if (nodeList.Count > 1)
            {
                // Keep the first occurrence and remove every later one.
                for (int i = 1; i < nodeList.Count; i++)
                {
                    node.RemoveChild(nodeList[i]);
                }
            }
        }
    }
}
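
For context, the conversion step and the helper referenced in the routine are not shown above. A minimal sketch of how they might look is below. The class name JsonXmlSketch and the method LoadJsonAsXml are hypothetical, and IsBottomElement is reconstructed only from its comment ("has no XmlNodeType.Element type children"), so the real implementation may differ.

```csharp
using System.Linq;
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml;

// Hypothetical helper class; names are illustrative, not from the original code.
public static class JsonXmlSketch
{
    // Reads a JSON string through the JSON-to-XML mapping reader
    // and loads the result into an XmlDocument.
    public static XmlDocument LoadJsonAsXml(string json)
    {
        var bytes = Encoding.UTF8.GetBytes(json);

        using (var reader = JsonReaderWriterFactory.CreateJsonReader(
            bytes, XmlDictionaryReaderQuotas.Max))
        {
            var document = new XmlDocument();
            document.Load(reader);
            return document;
        }
    }

    // Assumed meaning of IsBottomElement: true when the node
    // has no child of type XmlNodeType.Element.
    public static bool IsBottomElement(XmlNode node)
    {
        return !node.ChildNodes
            .Cast<XmlNode>()
            .Any(child => child.NodeType == XmlNodeType.Element);
    }
}
```

With these in place, the routine would be called on the document element, for example RemoveDuplicateChildren(JsonXmlSketch.LoadJsonAsXml(json).DocumentElement).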

Separately, I now need to use the DataContractJsonSerializer to deserialise the JSON to a strongly typed object, using the following code:

var serialiser = new DataContractJsonSerializer(typeof(ShipmentList));

// Dispose the stream once the object has been read.
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
{
    var result = (ShipmentList)serialiser.ReadObject(stream);
}

This fails when the JSON contains that duplicate node, so before deserialising I need to do what the RemoveDuplicateChildren method does, but stepping through the JSON instead of an XML node. I can't take the quick-and-dirty route of using JsonConvert to convert to XML, removing the node with my existing method, and converting back to JSON, because the round trip through XML changes the JSON. Is there a C# equivalent of the XmlNode class for navigating a JSON hierarchy?

UPDATE:

This question has become obscured by some of the comments. To clarify: the nodes I want to remove from the JSON are any that repeat a name (the content is irrelevant) at the same level of the same parent, such as the second "outlet_id" of the second "shipments" item in the example above. I need to do this generically, without hard-coded element names. The RemoveDuplicateChildren method above does exactly what is needed; I'm just asking whether there is a class I can use to do the same thing on a JSON string instead of an XML string.
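
To make the requirement concrete, here is a rough sketch of one possible approach, assuming Json.NET (Newtonsoft.Json) is referenced, as the mention of JsonConvert suggests. It reads the JSON token by token with JsonTextReader, tracks the property names already seen in each open object, and copies everything except repeated properties to a JsonTextWriter. The class name JsonDuplicateRemover and the method RemoveDuplicateProperties are hypothetical names chosen for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Hypothetical helper; assumes the Newtonsoft.Json package is available.
public static class JsonDuplicateRemover
{
    // Returns the JSON with every repeated property name dropped,
    // keeping the first occurrence within each object.
    public static string RemoveDuplicateProperties(string json)
    {
        var output = new StringWriter();

        using (var reader = new JsonTextReader(new StringReader(json)))
        using (var writer = new JsonTextWriter(output))
        {
            // Keep date-like strings as plain strings.
            reader.DateParseHandling = DateParseHandling.None;

            // One set of seen names per currently open object.
            var seen = new Stack<HashSet<string>>();

            while (reader.Read())
            {
                switch (reader.TokenType)
                {
                    case JsonToken.StartObject:
                        seen.Push(new HashSet<string>());
                        writer.WriteStartObject();
                        break;

                    case JsonToken.EndObject:
                        seen.Pop();
                        writer.WriteEndObject();
                        break;

                    case JsonToken.PropertyName:
                    {
                        var name = (string)reader.Value;
                        if (seen.Peek().Add(name))
                        {
                            writer.WritePropertyName(name);
                        }
                        else
                        {
                            // Duplicate: move to its value, then skip
                            // past it if it is an object or array.
                            reader.Read();
                            reader.Skip();
                        }
                        break;
                    }

                    default:
                        // Array brackets and primitive values are copied as-is.
                        writer.WriteToken(reader.TokenType, reader.Value);
                        break;
                }
            }
        }

        return output.ToString();
    }
}
```

The cleaned string could then be passed to the DataContractJsonSerializer code above in place of the original json. Two caveats: the output is written without the original whitespace, and numbers are re-serialised, so a value written in exponent form may come out formatted differently.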

  • I'm confused... where's the duplicate node? – Brian Driscoll Dec 16 '14 at 14:47
  • What causes the error on the duplicate node? Is it because `ShipmentList` doesn't allow duplicates? Sounds like you need to have the `ShipmentList` deserializer read everything into a list that allows duplicates, and then in the `OnDeserialized` method remove the duplicates. You should show us the relevant `ShipmentList` code. – Jim Mischel Dec 16 '14 at 14:49
  • I guess he just meant to say the Outlet_id repeated twice in the second set of data – Xavier Dec 16 '14 at 14:50
  • It's also unclear what you mean by "This fails" - what goes wrong? Is it definitely the deserialization that fails? – Jon Skeet Dec 16 '14 at 14:54
  • The duplicate is the second outlet_id in the second shipment, as Xavier says. It is the deserialisation that fails. The ReadObject method throws a SerializationException and the message is "The data contract type 'Shipment' cannot be deserialized because the data member 'outlet_id' was found more than once in the input." – Valerie Metcalf Dec 16 '14 at 15:05
  • It sounds like the way you're deserializing is broken, then. You've got two separate objects which happen to use the same `outlet_id`. That sounds entirely reasonable to me - why would it not be valid? Can no two objects have the same status, or type? I can see why `id` should be unique, but that's all. – Jon Skeet Dec 30 '14 at 16:54
  • They're the same objects, both Shipment objects. There is nothing wrong with the deserialisation, it works with valid JSON whatever properties are included or not, but cannot deserialise with duplicate nodes. That's why I need to remove the duplicates before attempting to deserialise. – Valerie Metcalf Jan 02 '15 at 11:01
  • Note to others following this question: I've deleted my answer as it didn't satisfy the OP. The sample JSON really looks like two shipments (different shipment IDs, different statuses etc) which happen to come from the same outlet to me, so to my mind they're not duplicates. Having tried to discuss this with the OP, neither of us is persuading the other of anything, so we'll see if anyone else has any luck. There's little point in me writing code that tries to satisfy what I consider to be a broken requirement. – Jon Skeet Jan 02 '15 at 11:41
  • The JSON is two different shipments. That's perfectly valid and not the problem. The problem is the two identical outlet_id properties of the second shipment. Because this is invalid JSON the deserialisation throws an exception (see my comment above). In order to deserialise the JSON I need to first remove all the duplicate nodes, exactly as I am doing with the XML. – Valerie Metcalf Jan 02 '15 at 15:59