I'm trying to recursively validate a custom JSON schema against a template JSON Schema using the jsonschema module in Python 3.

The custom JSON looks like this:

{
  "endpoint": "rfc",
  "filter_by": ["change_ref", "change_i"],
  "expression": [
    {
      "field": "first_name",
      "operator": "EQ",
      "value": "O'Neil"
    },
    "AND",
    [
      {
        "field": "last_name",
        "operator": "NEQ",
        "value": "Smith"
      },
      "OR",
      {
        "field": "middle_name",
        "operator": "EQ",
        "value": "Sam"
      }
    ]
  ],
  "limit_results_to": "2"
} 

The above can be generalized further by adding multiple ANDs and ORs, hence my question about recursion.

The template that I'm trying to validate this JSON against is in the following piece of code:

import json
import jsonschema


def get_data(file):
    with open(file) as data_file:
        return json.load(data_file)


def json_schema_is_valid():
    data = get_data("other.json")
    valid_schema = {
        "type": "object",
        "required": ["endpoint", "filter_by", "expression", "limit_results_to"],
        "properties": {
            "endpoint": {
                "type": "string",
                "additionalProperties": False
            },
            "filter_by": {
                "type": ["string", "array"],
                "additionalProperties": False
            },
            "limit_results_to": {
                "type": "string",
                "additionalProperties": False
            },
            "expression": {
                "type": "array",
                "properties": {
                    "field": {
                        "type": "string",
                        "additionalProperties": False
                    },
                    "operator": {
                        "type": "string",
                        "additionalProperties": False
                    },
                    "value": {
                        "type": "string",
                        "additionalProperties": False
                    }
                },
                "required": ["field", "operator", "value"]
            }
        }
    }
    return jsonschema.validate(data, valid_schema)


if __name__ == '__main__':
    print(json_schema_is_valid())

Now, something seems wrong because when I run the above code, I get None, which might (or might not) be okay. When I change the type of a property to something that isn't allowed, I don't get any exception. Is there something wrong in my template? It looks like the expression properties are not parsed. Also, I read here that I can make my template recursively validate my custom JSON using '$ref': '#', but I didn't quite understand how to use it. Could someone give me some hints?

1 Answer

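Your schema looks to work for the top-level properties, excluding the recursive part. One caveat: properties and required only constrain objects, so on the array-typed expression they are silently ignored, which is why changing a type inside expression raises nothing. Looking at the source code on GitHub for jsonschema.validate, we can see that the function doesn't have a return, so getting None is expected when the data is valid. Invalid data raises an exception instead, and so I think it's safe to assume that the way you validate would be using something like: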

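import jsonschema

try:
    # validate() returns None on success and raises on failure
    jsonschema.validate(json_data, schema)
except jsonschema.ValidationError:
    print('invalid json')
else:
    print('valid json')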
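
As a quick check of that behaviour, here is a minimal sketch. The tiny schemas and the is_valid helper are made up for illustration and are not part of jsonschema or of the asker's template. It also shows the caveat that properties and required placed on an array-typed schema are not applied to the array's items.

```python
import jsonschema

# Tiny illustrative schema (hypothetical, not the asker's full template).
schema = {
    "type": "object",
    "properties": {"limit_results_to": {"type": "string"}},
    "required": ["limit_results_to"],
}

# "properties" and "required" only constrain objects, so on an array
# they are ignored, as with "expression" in the question.
array_schema = {
    "type": "array",
    "properties": {"field": {"type": "string"}},
    "required": ["field"],
}


def is_valid(instance, schema):
    """Hypothetical helper: turn validate()'s exception into a boolean."""
    try:
        jsonschema.validate(instance, schema)
    except jsonschema.ValidationError:
        return False
    return True


good = {"limit_results_to": "2"}
bad = {"limit_results_to": 2}  # integer where a string is required

print(jsonschema.validate(good, schema))      # validate() has no return value
print(is_valid(good, schema))                 # accepted
print(is_valid(bad, schema))                  # rejected: wrong type
print(is_valid([{"field": 1}], array_schema)) # accepted: items are never checked
```

Wrapping the call in a helper like this is only a convenience; catching jsonschema.ValidationError directly, as above, works just as well.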

To create your recursion you need a couple of definitions. I found out how to do this from Recursive JSON Schema. The first covers your normal comparisons. This is pretty much just moving your current expression schema out into its own definition.

{
    "definitions": {
        "comparison": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
                "field": {
                    "type": "string"
                },
                "operator": {
                    "type": "string"
                },
                "value": {
                    "type": "string"
                }
            },
            "required": ["field", "operator", "value"]
        }
    }
}

Performing your recursive boolean comparisons is a little harder. I don't know how to limit arrays to three items where the second has a different type, so I opted to use an object, which has three well-defined items. Also, the first and last items should accept the same types, so that you can write comparisons like (a and b) or (c and d). And so I got the following schema:

{
    "definitions": {
        "comparison": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
                "field": {
                    "type": "string"
                },
                "operator": {
                    "type": "string"
                },
                "value": {
                    "type": "string"
                }
            },
            "required": ["field", "operator", "value"]
        },
        "booleanComparison": {
            "type": "object",
            "additionalProperties": false,
            "properties": {
                "item1": {
                    "type": "object",
                    "oneOf": [
                        {"$ref": "#/definitions/comparison"},
                        {"$ref": "#/definitions/booleanComparison"}
                    ]
                },
                "operator": {
                    "type": "string"
                },
                "item2": {
                    "type": "object",
                    "oneOf": [
                        {"$ref": "#/definitions/comparison"},
                        {"$ref": "#/definitions/booleanComparison"}
                    ]
                }
            },
            "required": ["item1", "operator", "item2"]
        }
    },
    "type": "object",
    "properties": {
        "endpoint": {
            "type": "string"
        },
        "filter_by": {
            "type": ["string", "array"]
        },
        "limit_results_to": {
            "type": "string",
            "additionalProperties": false
        },
        "expression": {
            "type": "object",
            "oneOf": [
                {"$ref": "#/definitions/comparison"},
                {"$ref": "#/definitions/booleanComparison"}
            ]
        }
    },
    "required": ["endpoint", "filter_by", "expression", "limit_results_to"]
}
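
Since this schema expects nested objects rather than the flat arrays from the question, existing data has to be reshaped first. A minimal sketch of such a conversion follows. The to_nested function is hypothetical, and it assumes that a list with several operators is folded strictly left to right with no AND/OR precedence, which may not match the semantics you want.

```python
def to_nested(expr):
    """Convert the question's list form into the item1/operator/item2 form.

    Hypothetical helper. Assumption: chains such as [a, "AND", b, "OR", c]
    are folded left to right, giving ((a AND b) OR c).
    """
    if isinstance(expr, dict):
        return expr  # a plain comparison is already in its final shape
    if not isinstance(expr, list) or len(expr) % 2 == 0:
        raise ValueError("expected: operand, operator, operand, ...")

    result = to_nested(expr[0])
    # Walk the remaining (operator, operand) pairs from left to right.
    for i in range(1, len(expr), 2):
        operator, operand = expr[i], expr[i + 1]
        if not isinstance(operator, str):
            raise ValueError("operator must be a string")
        result = {
            "item1": result,
            "operator": operator,
            "item2": to_nested(operand),
        }
    return result


# The "expression" value from the question.
expression = [
    {"field": "first_name", "operator": "EQ", "value": "O'Neil"},
    "AND",
    [
        {"field": "last_name", "operator": "NEQ", "value": "Smith"},
        "OR",
        {"field": "middle_name", "operator": "EQ", "value": "Sam"},
    ],
]

nested = to_nested(expression)
print(nested["operator"])           # top-level operator
print(nested["item2"]["operator"])  # operator of the nested group
```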

This schema says the following is valid:

{
    "endpoint": "rfc",
    "filter_by": ["change_ref", "change_i"],
    "limit_results_to": "2",
    "expression": {
        "item1": {
            "field": "first_name",
            "operator": "EQ",
            "value": "O'Neil"
        },
        "operator": "AND",
        "item2": {
            "item1": {
                "field": "last_name",
                "operator": "NEQ",
                "value": "Smith"
            },
            "operator": "OR",
            "item2": {
                "field": "middle_name",
                "operator": "EQ",
                "value": "Sam"
            }
        }
    }
}

However, it says the document is invalid if you change "value": "Sam" to "value": true, as that's the wrong type. So it seems to work as intended recursively.
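
To reproduce that check from Python, something like the following sketch should work. It is a condensed copy of the schema, not a new design: the shared NODE dict is just a shorthand, the "type": "object" entries that sat next to oneOf and the additionalProperties on limit_results_to are left out because they change nothing, and the "$schema" line pinning draft-07 is an assumption added here.

```python
import copy

import jsonschema

# Shorthand (hypothetical name): a leaf comparison or a nested boolean one.
NODE = {
    "oneOf": [
        {"$ref": "#/definitions/comparison"},
        {"$ref": "#/definitions/booleanComparison"},
    ]
}

SCHEMA = {
    # Assumption: pin draft-07, where "definitions" is a standard keyword.
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
        "comparison": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "field": {"type": "string"},
                "operator": {"type": "string"},
                "value": {"type": "string"},
            },
            "required": ["field", "operator", "value"],
        },
        "booleanComparison": {
            "type": "object",
            "additionalProperties": False,
            "properties": {
                "item1": NODE,
                "operator": {"type": "string"},
                "item2": NODE,
            },
            "required": ["item1", "operator", "item2"],
        },
    },
    "type": "object",
    "properties": {
        "endpoint": {"type": "string"},
        "filter_by": {"type": ["string", "array"]},
        "limit_results_to": {"type": "string"},
        "expression": NODE,
    },
    "required": ["endpoint", "filter_by", "expression", "limit_results_to"],
}

document = {
    "endpoint": "rfc",
    "filter_by": ["change_ref", "change_i"],
    "limit_results_to": "2",
    "expression": {
        "item1": {"field": "first_name", "operator": "EQ", "value": "O'Neil"},
        "operator": "AND",
        "item2": {
            "item1": {"field": "last_name", "operator": "NEQ", "value": "Smith"},
            "operator": "OR",
            "item2": {"field": "middle_name", "operator": "EQ", "value": "Sam"},
        },
    },
}

jsonschema.validate(document, SCHEMA)  # passes silently

# Break the innermost comparison, two levels down.
broken = copy.deepcopy(document)
broken["expression"]["item2"]["item2"]["value"] = True

rejected = False
try:
    jsonschema.validate(broken, SCHEMA)
except jsonschema.ValidationError:
    rejected = True
print("nested error detected:", rejected)
```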

Peilonrayz