2

Situation

This is to use on our platform, https://dashdash.com - a spreadsheet with integrations for known web services (and your private APIs too).

Particular case (testable)

Consider the following source JSON, I want to return only the array elements that have nested objects B, C and G. G is on a different depth than B and C.

Below you can see the source and 2 options for the return.

source JSON

[  
   {  
      "A":"val1",
      "B":"val2",
      "C":"val3",
      "D":{  
         "E":[  
            {  
               "F":"val4"
            }
         ],
         "G":[  
            {  
               "H":"val5",
               "I":"val6",
               "J":"val7"
            }
         ]
      }
   },
   {  
      "A":"val8",
      "B":"val9",
      "C":"val10",
      "D":{  
         "E":[  
            {  
               "F":"val11"
            }
         ],
         "G":[  
            {  
               "H":"val12",
               "I":"val13",
               "J":"val14"
            }
         ]
      }
   },
   {  
      "A":"val15",
      "B":"val16"
   },
   {  
      "A":"val8",
      "B":"val9",
      "C":"val10",
      "D":{  
         "E":[  
            {  
               "F":"val11"
            }
         ]
      }
   }
]

Expected return Option 1.

[
   {
      "B":"val2",
      "C":"val3",
      "G":[
         {
            "H":"val5",
            "I":"val6",
            "J":"val7"
         }
      ]
   },
   {
      "B":"val9",
      "C":"val10",
      "G":[
         {
            "H":"val12",
            "I":"val13",
            "J":"val14"
         }
      ]
   }
]

Expected return Option 2.

[
   {
      "B":"val2",
      "C":"val3",
      "D":{
         "E":[
            {
               "F":"val4"
            }
         ],
         "G":[
            {
               "H":"val5",
               "I":"val6",
               "J":"val7"
            }
         ]
      }
   },
   {
      "B":"val9",
      "C":"val10",
      "D":{
         "E":[
            {
               "F":"val11"
            }
         ],
         "G":[
            {
               "H":"val12",
               "I":"val13",
               "J":"val14"
            }
         ]
      }
   }
]

Where I am

  • I can extract all the array elements that have B,C and D, with the query $..['B','C','D']

I have tried to extract B, C and G, but all the following queries fail:

  • $..['B','C','G']: returns null.
  • $..['B','C',['D'].['G']]: returns only the objects inside G.

Again, I'm using JayWay JsonPath https://github.com/json-path/JsonPath, and my code works exactly like https://jsonpath.herokuapp.com.

Thanks in advance

Humberto
  • 99
  • 12

2 Answers2

3

You can solve this problem setting the JayWay to DEFAULT_PATH_LEAF_TO_NULL configuration (as decribed on oficial documentation: https://github.com/json-path/JsonPath) and after this apply a null comparation evaluation:

like this:

$.[?(@.A != null && @.B != null && @.D != null &&  @.D.G != null)]

or this:

$.[?((@.A != null && @.B != null) && ((@.D != null &&  @.D.G != null) || (@.G != null)))]

For set DEFAULT_PATH_LEAF_TO_NULL you should change you default configuration:

Configuration conf = Configuration.defaultConfiguration();
Configuration conf2 = conf.addOptions(Option.DEFAULT_PATH_LEAF_TO_NULL);

Note: If you are using a legacy version of the jayway the comparison operator could not work correctly, to get more information see https://code.google.com/archive/p/json-path/issues/27

I tested this solution and worked fine for me:

Test did on https://jsonpath.herokuapp.com/ with the following input:

[  
   {  
      "A":"val1",
      "B":"val2",
      "C":"val3",
      "D":{  
         "E":[  
            {  
               "F":"val4"
            }
         ],
         "G":[  
            {  
               "H":"val5",
               "I":"val6",
               "J":"val7"
            }
         ]
      }
   },
   {  
      "A":"val8",
      "B":"val9",
      "C":"val10",
      "D":{  
         "E":[  
            {  
               "F":"val11"
            }
         ],
         "G":[  
            {  
               "H":"val12",
               "I":"val13",
               "J":"val14"
            }
         ]
      }
   },
   {  
      "A":"val15",
      "B":"val16"
   },
   {  
      "A":"val8",
      "B":"val9",
      "C":"val10",
      "D":{  
         "E":[  
            {  
               "F":"val11"
            }
         ]
      }
   }
]

and the result was:

[
   {
      "A" : "val1",
      "B" : "val2",
      "C" : "val3",
      "D" : {
         "E" : [
            {
               "F" : "val4"
            }
         ],
         "G" : [
            {
               "H" : "val5",
               "I" : "val6",
               "J" : "val7"
            }
         ]
      }
   },
   {
      "A" : "val8",
      "B" : "val9",
      "C" : "val10",
      "D" : {
         "E" : [
            {
               "F" : "val11"
            }
         ],
         "G" : [
            {
               "H" : "val12",
               "I" : "val13",
               "J" : "val14"
            }
         ]
      }
   }
]

See the evidence and note that returning null option is set to true

Let me know if you need any further assistance on this.

dromualdo
  • 31
  • 1
  • I found a minor issue: - The problem of activating `DEFAULT_PATH_LEAF_TO_NULL` is that other functions will return nulls even when paths don't exist. Example: In Json ``` [ { "foo" : "foo1", "bar" : "bar1" }, { "foo" : "foo2" } ] ``` queries "$[*].bar" and " $..bar" will return ``` [ "bar1", null ] ``` where I wanted ``` [ "bar1" ] ``` Is there a query+configuration that would solve both problems? - do a matching for existing paths in [B,C,D.G] (or [B,C,..G]); and - not return nulls for `anything` and `seek` queries? – Humberto Apr 02 '19 at 13:43
  • actually @dromualdo, your query is almost there, but it doesn't check for the _existence_ of the property, rather that the property is NOT null. That is different. For example, on the JSON above, if you make [0].D.B = null, the JsonPath I'm looking for would return 2 objects, [0] and [1] - both have B,C and ..G and it only returns [1], because [0].B.G exists but [0].B.G=null. – Humberto Apr 02 '19 at 17:33
2

I've been trying some different approaches and I think a simpler expression does the trick:

$.*[?(@.B && @.C && @.D.G)]

This doesn't need any special config other than default (according to experiment done on https://jsonpath.herokuapp.com and yields the following result:

[
   {
      "A" : "val1",
      "B" : "val2",
      "C" : "val3",
      "D" : {
         "E" : [
            {
               "F" : "val4"
            }
         ],
         "G" : [
            {
               "H" : "val5",
               "I" : "val6",
               "J" : "val7"
            }
         ]
      }
   },
   {
      "A" : "val8",
      "B" : "val9",
      "C" : "val10",
      "D" : {
         "E" : [
            {
               "F" : "val11"
            }
         ],
         "G" : [
            {
               "H" : "val12",
               "I" : "val13",
               "J" : "val14"
            }
         ]
      }
   }
]

What do you think?

Luis Silva
  • 36
  • 1