Overview

I have a lot of .yaml files and a schema to validate them. Sometimes an "incorrect" value is in fact correct.

I need a way to ignore certain fields, so that no validation is performed on them.

Example

  ## file -- a.yaml
  some_dict:
      some_key: some_valid_value

  ## file -- b.yaml
  some_dict:
      some_key: some_INVALID_value # cerberus: ignore

How can I do this?

dreftymac
Pablo
  • Apparently interesting problem. Can you provide a small example of what you mean? – Bill Bell Sep 27 '18 at 16:42
  • For example: `- ip_lan: X.X.X.X location: XXX model: EdgeSwitch name: SW-XX-XXX offset: [-1, -1] parents: PTP-XX-XXX-REM` Suppose the location field is always required. This file is X.yaml, and there is one X.yaml for each client. The location field is required, except for 2 clients. So I need a way to mark location as not required by editing only the .yaml files, not my program or my cerberus schema. – Pablo Sep 28 '18 at 09:44
  • If you have set `location` to be required in your validation schema, and you are not allowed to modify the validation schema in any way, then you are in a difficult situation. If you are allowed to make a slight modification to the validation schema, then there is a straightforward solution. Why can't you modify the validation schema? – dreftymac Dec 11 '18 at 01:35
  • @dreftymac It is not that I can not edit my schema, it is only that usually `location` field is required, but sometimes it is not, and it depends on what .yml file I am validating. It is like I have `a.yml`, `b.yml` and `c.yml`, in `a.yml` and `b.yml` `location` is required, but in `c.yml` it is not. Schema of all .yml files is the same, but only for some files `location` is not required. So I need a way to tell cerberus: hey, do not mark this field as required, even if schema says that. – Pablo Dec 12 '18 at 15:17
  • OK, so can you give an example of the conditions for requirement? Such as `location` not required if `model == CiscoXYZ` or something? What do you want to use as the trigger to let Cerberus know location is not required? – dreftymac Dec 12 '18 at 19:28
  • There is no way to determine if a value is valid or not, except comment. Like this: https://gist.github.com/palvarezcordoba/f3da9121ca4de821f5c6f46d568a65e3 a.yml is valid. b.yml is invalid, but I want cerberus to ignore it (not all the file, just the incorrect key-value) – Pablo Dec 14 '18 at 13:24
  • OK, are you allowed to modify the yaml name-value pairs in b.yml? ... or are you instead restricted, and only allowed to add comments? If you are only allowed to add comments, this is not a straightforward problem because comments are generally discarded and ignored in most YAML parsers. – dreftymac Mar 15 '19 at 22:30

1 Answer

Quick Answer (TL;DR)

  • The "composite validation" approach allows for conditional (context-aware) validation rules.
  • The python cerberus package supports composite validation "out of the box".
  • YAML comments cannot drive composite validation, because YAML parsers typically discard them; YAML fields can.
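
As a rough illustration of using a YAML field instead of a comment, the sketch below assumes each document may carry a hypothetical `_skip_validation` list naming the top-level fields to ignore. That field name and both helper functions are invented for this example; they are not part of cerberus. The helpers only reshape plain dicts, so the result can then be handed to a validator.

```python
import copy

# Hypothetical marker field; cerberus does not define anything like this.
SKIP_KEY = "_skip_validation"

def schema_for(document, base_schema):
    """Return a copy of base_schema without the fields the document opts out of."""
    skipped = set(document.get(SKIP_KEY, []))
    return {
        field: copy.deepcopy(rules)
        for field, rules in base_schema.items()
        if field not in skipped
    }

def strip_marker(document):
    """Return the document without the marker field itself."""
    return {k: v for k, v in document.items() if k != SKIP_KEY}

base_schema = {
    "location": {"type": "string", "required": True},
    "model": {"type": "string", "required": True},
}

# c.yml, once parsed: location is missing, but the file opts out of that check.
doc_c = {"model": "EdgeSwitch", SKIP_KEY: ["location"]}

effective_schema = schema_for(doc_c, base_schema)
clean_doc = strip_marker(doc_c)
```

One way to use the result, assuming cerberus is installed, would be `cerberus.Validator(effective_schema, allow_unknown=True).validate(clean_doc)`; `allow_unknown=True` is there so that a skipped field that is still present in the file is not rejected as unknown.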

Detailed Answer

Context

  • python 2.7
  • cerberus validation package

Problem

  • Developer PabloPajamasCreator wishes to apply conditional validation rules.
  • The conditional validation rules are activated based on the presence or value of other fields in the dataset.
  • The conditional validation rules need to be sufficiently flexible to change "on-the-fly" based on any arbitrary states or relationships in the source data.

Solution

  • This approach can be accomplished with composite data validation.
  • Under this use-case, composite validation simply means creating a sequential list of validation rules, such that:
    • Each individual rule operates on a composite data variable
    • Each individual rule specifies a "triggering condition" for when the rule applies
    • Each individual rule produces one of three mutually-exclusive validation outcomes: validation-success, validation-fail, or validation-skipped
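
The three properties above can be sketched in a few lines of plain Python. This is only an illustration of the control flow, not the answer's actual implementation: the `trigger` and `check` callables are hypothetical stand-ins for the jmespath expression and the cerberus schema used in the example that follows.

```python
from typing import Any, Callable, Dict, List

# Hypothetical rule shape: a caption, a triggering condition, and a check.
Rule = Dict[str, Any]

def run_rules(rules: List[Rule], data: Dict[str, Any]) -> Dict[str, str]:
    """Apply each rule in order and record exactly one outcome per rule."""
    outcomes = {}
    for rule in rules:
        if not rule["trigger"](data):
            outcomes[rule["caption"]] = "validation-skipped"
        elif rule["check"](data):
            outcomes[rule["caption"]] = "validation-success"
        else:
            outcomes[rule["caption"]] = "validation-fail"
    return outcomes

REQUIRED = ("person_fname", "person_lname", "person_age")

rules = [
    {
        "caption": "check-required-fields",
        "trigger": lambda d: True,  # always applies
        "check": lambda d: all(key in d for key in REQUIRED),
    },
    {
        "caption": "check-underage-minor",
        # Only applies when an integer age below 18 is present.
        "trigger": lambda d: isinstance(d.get("person_age"), int) and d["person_age"] < 18,
        "check": lambda d: d.get("prize_category") in ("pets", "toys", "candy"),
    },
]

adult = {"person_fname": "Ann", "person_lname": "Lee", "person_age": 30}
minor = {"person_fname": "Bo", "person_lname": "Lee", "person_age": 9,
         "prize_category": "cash"}

adult_outcomes = run_rules(rules, adult)
minor_outcomes = run_rules(rules, minor)
```

For the adult record the second rule is skipped rather than failed, which is the distinction the three-outcome design is meant to capture.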

Example

Sample validation rules
- rule_caption:     check-required-fields
  rule_vpath:       "@"
  validation_schema:
    person_fname:
      type: string
      required: true
    person_lname:
      type: string
      required: true
    person_age:
      type: integer
      required: true

- rule_caption:     check-age-range
  rule_vpath:       '@|@.person_age'
  validation_schema:
    person_age:
      "min": 2
      "max": 120

- rule_caption:     check-underage-minor
  rule_vpath:       '[@]|[? @.person_age < `18`]'
  validation_schema:
    prize_category:
      type: string
      allowed: ['pets','toys','candy']
    prize_email:
      type:     string
      regex:    '[\w]+@.*'
  • The code above is a YAML formatted representation of multiple validation rules.

Rationale

  • This approach can be extended to any arbitrary level of complexity.
  • This approach is easily comprehensible by humans (although the jmespath syntax can be a challenge)
  • Any arbitrarily complex set of conditions and constraints can be established using this approach.

Pitfalls

  • The example above uses jmespath syntax for rule_vpath, which tells the system when to trigger each rule. This adds a dependency on jmespath.

dreftymac