I am using Python's jsonschema to validate JSON files against a schema. It works great. But now I need to remove any properties from my JSON that are not present in the schema.

I know that according to the JSON Schema docs, I can set the property:

"additionalProperties": false

to reject any files with additional properties. But this only makes validation fail for those files; it does not actually remove the extra properties.
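
For illustration, here is a minimal sketch of that behaviour, assuming Draft7Validator and a made-up schema and document. Validation reports the extra property as an error, and the document itself is left unchanged:

```python
from jsonschema import Draft7Validator

# Made-up schema and document, used only to illustrate the behaviour.
schema = {
    "type": "object",
    "properties": {"foo": {"type": "string"}},
    "additionalProperties": False,
}
doc = {"foo": "bar", "not_in_schema": "no no no"}

# iter_errors() collects every validation error instead of raising on the first.
errors = list(Draft7Validator(schema).iter_errors(doc))

for error in errors:
    # error.validator names the schema keyword that failed.
    print(error.validator, "->", error.message)

# The document is only checked, never modified.
print(doc)
```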

What is the best way to remove them?

I guess I can write my own script that:

  • walks every leaf node of the JSON file
  • checks whether the leaf node exists in the schema
  • if it does not, walks up the tree until it finds the highest node that does exist, then prunes the branch at that point.
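
A rough sketch of such a script is below. It walks the schema and the document together from the top down, which gives the same result as pruning at the highest unknown node. The prune() helper is hypothetical and only understands the properties keyword and a single-schema items keyword; $ref, patternProperties, allOf and similar keywords are ignored, so treat it as a starting point rather than a complete solution.

```python
import copy


def prune(instance, schema):
    """Return a copy of `instance` containing only what `schema` describes.

    Hypothetical helper: handles `properties` and single-schema `items` only.
    """
    # Boolean schemas (True/False) carry no property information.
    if not isinstance(schema, dict):
        return copy.deepcopy(instance)

    properties = schema.get("properties")
    if isinstance(instance, dict) and isinstance(properties, dict):
        # Keep only keys the schema lists, recursing into each kept value.
        return {
            key: prune(value, properties[key])
            for key, value in instance.items()
            if key in properties
        }

    items = schema.get("items")
    if isinstance(instance, list) and isinstance(items, dict):
        # Apply the single item schema to every array element.
        return [prune(element, items) for element in instance]

    # Nothing in the schema constrains this value further; keep it as is.
    return copy.deepcopy(instance)


# Made-up example data.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
        "tags": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"id": {"type": "integer"}},
            },
        },
    },
}
doc = {
    "name": "Ada",
    "extra": 1,
    "address": {"city": "London", "zip": "N1"},
    "tags": [{"id": 1, "color": "red"}],
}

cleaned = prune(doc, schema)
print(cleaned)
```

Because prune() builds a new structure, the original document is not modified.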

My question is: is there an existing Python library to do this, or do I need to write one? I have Googled, but without any success.

Richard

2 Answers

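You could extend the validator. The jsonschema documentation uses a similar technique to fill in default values.

This is a little late, but here is a solution.

I extend the validator to override how the properties keyword is validated. Any property that exists in the instance but is not listed under properties in the schema is deleted from the instance, and then the original properties validation runs. Note that this modifies the instance in place.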

from jsonschema import Draft7Validator, validators

def extend_validator(validator_class):
    # Keep the stock "properties" validator so normal validation still runs.
    validate_properties = validator_class.VALIDATORS["properties"]

    def remove_additional_properties(validator, properties, instance, schema):
        # "properties" only applies to objects; skip other instance types.
        if validator.is_type(instance, "object"):
            # Iterate over a copy of the keys because the dict is mutated.
            for prop in list(instance.keys()):
                if prop not in properties:
                    del instance[prop]

        yield from validate_properties(validator, properties, instance, schema)

    return validators.extend(
        validator_class, {"properties": remove_additional_properties},
    )

PruningDraft7Validator = extend_validator(Draft7Validator)

# Example usage:
obj = {
    'foo': 'bar',
    'not_in_schema': 'no no no'
}
schema = {
    'properties': {
        'foo': {
            'type': 'string'
        }
    }
}

# validate() modifies obj in place, removing the unknown property.
PruningDraft7Validator(schema).validate(obj)
assert obj == {'foo': 'bar'}
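
Since the pruning happens in place and is applied wherever a subschema has a properties keyword, a nested example may help. The sketch below repeats the same technique in self-contained form, with hypothetical names (extend_with_pruning, PruningValidator) and made-up data. It validates a deep copy so that the original document is preserved:

```python
import copy

from jsonschema import Draft7Validator, validators


def extend_with_pruning(validator_class):
    """Hypothetical helper: same technique, wrapping the "properties" keyword."""
    validate_properties = validator_class.VALIDATORS["properties"]

    def prune_then_validate(validator, properties, instance, schema):
        # Only objects can carry properties.
        if validator.is_type(instance, "object"):
            for prop in list(instance):
                if prop not in properties:
                    del instance[prop]
        yield from validate_properties(validator, properties, instance, schema)

    return validators.extend(validator_class, {"properties": prune_then_validate})


PruningValidator = extend_with_pruning(Draft7Validator)

# Made-up schema with a nested object.
schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        },
    },
}
original = {"user": {"name": "Ada", "password": "secret"}, "debug": True}

# Work on a deep copy so the caller's data is left untouched.
cleaned = copy.deepcopy(original)
PruningValidator(schema).validate(cleaned)

print(cleaned)
```

Objects whose subschema has no properties keyword are left untouched, because the overridden keyword never runs for them.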
Raphael Medaer

I don't see a straightforward way to achieve this without monkey-patching the iter_errors() method of the Validator class:

https://github.com/Julian/jsonschema/blob/master/jsonschema/validators.py#L296

Raymond Hettinger