
I'd like to apply the defaults that are defined in my schema. The python-jsonschema FAQ already has an example for this: https://python-jsonschema.readthedocs.io/en/stable/faq/

The example extends the default validator for the properties keyword so that it sets the defaults before validating. However, I hit an issue as soon as I use the anyOf keyword within the same schema.

Let me give you an example:

from jsonschema import Draft7Validator, validators


def extend_with_default(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        for property, subschema in properties.items():
            if "default" in subschema:
                instance.setdefault(property, subschema["default"])

        for error in validate_properties(
            validator, properties, instance, schema,
        ):
            yield error

    return validators.extend(
        validator_class, {"properties": set_defaults},
    )


DefaultValidatingDraft7Validator = extend_with_default(Draft7Validator)

obj = {
    "my_list": [{"class_name": "some_class"}]
}
schema = {
    "properties": {
        "my_list": {
            "type": "array",
            "items": {
                "anyOf": [
                    {
                        "type": "object",
                        "properties": {
                            "class_name": {
                                "const": "some_class"
                            },
                            "some_property": {
                                "type": "number",
                                "default": 1
                            }
                        },
                        "required": ["class_name", "some_property"],
                        "additionalProperties": False
                    },
                    {
                        "type": "object",
                        "properties": {
                            "class_name": {
                                "const": "another_class"
                            },
                            "another_property": {
                                "type": "number",
                                "default": 1
                            }
                        },
                        "required": ["class_name", "another_property"],
                        "additionalProperties": False
                    }
                ]
            }
        }
    }
}

DefaultValidatingDraft7Validator(schema).validate(obj)
print(obj)

This example actually works as intended. Running it provides the following output:

{'my_list': [{'class_name': 'some_class', 'some_property': 1}]}

So the property some_property was correctly set with the default value of 1. However, if we now change the class_name within the object to another_class, which fits the second entry within the anyOf list, we get the following issue:

obj = {
    "my_list": [{"class_name": "another_class"}]
}

=>

jsonschema.exceptions.ValidationError: {'class_name': 'another_class', 'some_property': 1, 'another_property': 1} is not valid under any of the given schemas

Failed validating 'anyOf' in schema['properties']['my_list']['items']:
    {
        "anyOf": [
            {
                "type": "object",
                "properties": {
                    "class_name": {
                        "const": "some_class"
                    },
                    "some_property": {
                        "type": "number",
                        "default": 1
                    }
                },
                "required": ["class_name", "some_property"],
                "additionalProperties": False
            },
            {
                "type": "object",
                "properties": {
                    "class_name": {
                        "const": "another_class"
                    },
                    "another_property": {
                        "type": "number",
                        "default": 1
                    }
                },
                "required": ["class_name", "another_property"],
                "additionalProperties": False
            }
        ]
    }

On instance['my_list'][0]:
    {'another_property': 1,
     'class_name': 'another_class',
     'some_property': 1}

While anyOf iterates over its list, the instance it was given has already been changed by the first subschema. The anyOf validator runs all relevant validators of each subschema, so the properties validator of the first subschema inserts that subschema's defaults into the current instance. This happens even when the subschema fails validation within anyOf. In this example, the first subschema does not match but still inserts 'some_property': 1, so once anyOf has also tried the second subschema the instance looks like this:

obj = {'class_name': 'another_class', 'some_property': 1, 'another_property': 1}

So when anyOf reaches the second subschema, which would normally match the object, the instance already contains an extra key. Because additionalProperties is False, validation against the second subschema fails as well. No subschema matches, and we get the error above.

So how can this be fixed? My idea is to have anyOf save the state of the instance before iterating over the subschemas and, whenever a subschema does not match, revert all changes made by that subschema. Unfortunately, I have not been able to implement this so far.
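
To illustrate the mechanism in isolation, here is a simplified model in plain Python. It is not jsonschema's code: `try_branch` is a made-up helper that applies one branch's defaults and then checks the `class_name` constant and the no-extra-keys rule, in the same order as in the schema above.

```python
def try_branch(instance, class_name, defaults):
    """Toy stand-in for one anyOf branch: apply defaults, then validate."""
    for key, value in defaults.items():
        # setdefault mutates the caller's dict, whether or not the branch matches
        instance.setdefault(key, value)
    allowed = {"class_name", *defaults}  # models additionalProperties: False
    return instance.get("class_name") == class_name and set(instance) <= allowed


item = {"class_name": "another_class"}

# First branch does not match, but its default stays behind in `item`.
first_ok = try_branch(item, "some_class", {"some_property": 1})

# Second branch would match, but now sees the leftover 'some_property' key.
second_ok = try_branch(item, "another_class", {"another_property": 1})

print(first_ok, second_ok)  # False False
print(item)
# {'class_name': 'another_class', 'some_property': 1, 'another_property': 1}
```

The final dict is the same instance that appears in the error message above.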

For reference, this is what my latest try looks like:

from jsonschema.exceptions import ValidationError


def extend_with_default(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        for property, subschema in properties.items():
            if "default" in subschema:
                instance.setdefault(property, subschema["default"])

        for error in validate_properties(
            validator, properties, instance, schema,
        ):
            yield error

    def any_of(validator, subschemas, instance, schema):
        instance_copy = instance.copy()

        all_errors = []
        for index, subschema in enumerate(subschemas):
            errs = list(validator.descend(instance, subschema, schema_path=index))
            if not errs:
                break
            instance = instance_copy  # Make sure an instance that did not fit is not modified
            all_errors.extend(errs)
        else:
            yield ValidationError(
                "%r is not valid under any of the given schemas" % (instance,),
                context=all_errors,
            )

    return validators.extend(
        validator_class, {"properties": set_defaults, "anyOf": any_of},
    )

Internally this seems to work, and validation now passes. But for some reason obj, which was passed in as {"my_list": [{"class_name": "another_class"}]}, now contains:

{'my_list': [{'class_name': 'another_class', 'some_property': 1}]}

I don't understand why. My guess is that, because dictionaries are mutable, they are modified in place as they are passed through the validators, so resetting the instance inside any_of probably has no effect outside that function. However, I can't figure out how to fix this. Can someone assist?
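
One direction that might work, as a minimal sketch rather than a verified solution: `instance = instance_copy` only rebinds the local name, so the dict held by the parent list keeps the leftover default, and the following branches validate the copy instead. The sketch below restores the original object in place. The helpers `restore` and `branch` are names made up for this example, and it assumes the instances under anyOf are dicts or lists.

```python
import copy

from jsonschema import Draft7Validator, validators
from jsonschema.exceptions import ValidationError


def extend_with_default(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        for name, subschema in properties.items():
            if "default" in subschema:
                instance.setdefault(name, subschema["default"])
        yield from validate_properties(validator, properties, instance, schema)

    def restore(instance, snapshot):
        # Hypothetical helper: mutate the caller's object instead of
        # rebinding a local name, so the parent container sees the rollback.
        if isinstance(instance, dict):
            instance.clear()
            instance.update(copy.deepcopy(snapshot))
        elif isinstance(instance, list):
            instance[:] = copy.deepcopy(snapshot)

    def any_of(validator, subschemas, instance, schema):
        snapshot = copy.deepcopy(instance)  # state before any branch ran
        all_errors = []
        for index, subschema in enumerate(subschemas):
            errs = list(validator.descend(instance, subschema, schema_path=index))
            if not errs:
                break  # keep the defaults of the branch that matched
            restore(instance, snapshot)  # undo defaults from the failed branch
            all_errors.extend(errs)
        else:
            yield ValidationError(
                "%r is not valid under any of the given schemas" % (instance,),
                context=all_errors,
            )

    return validators.extend(
        validator_class, {"properties": set_defaults, "anyOf": any_of},
    )


def branch(class_name, prop):
    # Hypothetical helper that builds one anyOf branch of the schema above.
    return {
        "type": "object",
        "properties": {
            "class_name": {"const": class_name},
            prop: {"type": "number", "default": 1},
        },
        "required": ["class_name", prop],
        "additionalProperties": False,
    }


schema = {
    "properties": {
        "my_list": {
            "type": "array",
            "items": {"anyOf": [
                branch("some_class", "some_property"),
                branch("another_class", "another_property"),
            ]},
        }
    }
}

obj = {"my_list": [{"class_name": "another_class"}]}
extend_with_default(Draft7Validator)(schema).validate(obj)
print(obj)  # {'my_list': [{'class_name': 'another_class', 'another_property': 1}]}
```

The snapshot is a deep copy so that defaults inserted into nested objects by a failed branch are rolled back too. Scalars need no restore because they cannot be mutated in place.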

Tim Keller
  • The way 'default' is used there isn't correct according to the spec, so you'll have to speak with the author(s) of this particular implementation to see how they intend you use it in this case. – Ether Oct 27 '20 at 16:20
  • @Ether: What do you mean it isn't correct? How is it wrong? – Tim Keller Oct 28 '20 at 08:25
  • "default" is an annotation, which is generated at the current data location when the schema it is in has validated successfully. When a property is missing, the corresponding subschema isn't evaluated at all, so the annotation is not generated. The proper place to put 'default' is at the object level: e.g. `{ "type": "object", "default": { "val1": 1, "val2": 2, ... }, "properties": { "val1": { ... }, "val2": { ... } } }` – Ether Oct 28 '20 at 23:02
  • @Ether: Thanks for the explanation. Do you have some reference where the usage of the "default" keyword is defined? I couldn't find anything. Unfortunately this approach does also not work in this context as only the "properties" validator is extended. – Tim Keller Oct 29 '20 at 08:35
  • It's not very clear in earlier drafts, but the latest (current) draft is much more specific about its use: https://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.9 and annotations in general are defined in https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.7.7 – Ether Oct 29 '20 at 15:38
