I am trying to type all nested properties as strings, regardless of their names.

An example of the data looks like this (except with a bunch of columns):

[
    {
        "var_1": "some_string",
        "var_2": {
                "col_1": "x",
                "col_2": "y",
                "col_3": "z"
                },
        "var_3": "another_string"
        
    }
]

I used a YAML-to-JSON converter and got the following JSON schema, but my process to flatten the file does not seem to pick up the nested information.

{
  "$id": "main.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "some data",
  "type": "object",
  "properties": {
    "var_1": {
      "$ref": "another_schema.json#/definitions/var_1"
    },
    "var_2": {
      "type": "object",
      "properties": {
        "fieldNames": {
          "uniqueItems": true,
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      }
    },
    "var_3": {
      "type": "string",
      "description": "another variable"
    }
  }
}

Is there another way to reference all the variables/items inside of fields/fieldNames (col_1, col_2, col_3)?

alexb523
  • Your schema isn't valid JSON (just count your braces).... Also, in the schema you define the top level to be an `object` but your content starts with a `[`, which is an `array`. And your property `var2` is defined as an `array` but your content has it being an `object` (by means of `{}`). And, could you elaborate on what you mean by "reference all the variables/items"? – Daniel Schneider Nov 04 '22 at 11:18
  • @DanielSchneider I have updated the json to be valid. I want to be able to create a schema for all the columns 1, 2, 3 without explicitly calling them. – alexb523 Nov 04 '22 at 15:01
  • Provided an answer below -- let me know if that wasn't what you were after. – Daniel Schneider Nov 06 '22 at 13:40

1 Answer

I assume that you want to enforce that all properties under var_2 are of type string, whatever their names. I can think of two ways of doing that:

  1. Define additionalProperties with additional constraints, concretely "type": "string":
      "var_2": {
        "type": "object",
        "additionalProperties": {
          "type": "string"
        }
      },
  2. Use patternProperties with a pattern that matches all field names (".*"). Here you define constraints that apply to every property whose name matches the regex (.* matches any name), concretely:
      "var_2": {
        "type": "object",
        "patternProperties": {
          ".*": {
            "type": "string"
          }
        },
        "additionalProperties": false
      },
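
As a rough illustration of why ".*" covers every field name: JSON Schema patterns are not anchored, so a validator effectively does a regex search on each property name. The sketch below mimics only that matching step, using Python's `re` module as a stand-in for the ECMA 262 dialect that JSON Schema recommends (for simple patterns like these the two behave the same). The helper name `classify_property_names` is hypothetical and not part of any library.

```python
import re


def classify_property_names(names, patterns):
    """Map each property name to the patterns that match it.

    Uses an unanchored search, which is how patternProperties
    keys are applied to property names.
    """
    return {
        name: [p for p in patterns if re.search(p, name)]
        for name in names
    }


names = ["col_1", "col_2", "anything else", ""]

# With ".*" every name matches, even the empty string, so nothing is
# left over for "additionalProperties": false to reject.
matches = classify_property_names(names, [".*"])
unmatched = [name for name, hits in matches.items() if not hits]
```

With a narrower pattern such as "^col_", the names that do not match would fall through to additionalProperties, which is where the two options start to differ.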

Putting both into one schema (and adding the fact that your content starts with an array) would give you this:

{
  "$id": "main.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "title": "some data",
    "type": "object",
    "properties": {
      "var_1": {
        "type": "string"
      },
      "var_2a": {
        "type": "object",
        "patternProperties": {
          ".*": {
            "type": "string"
          }
        },
        "additionalProperties": false
      },
      "var_2b": {
        "type": "object",
        "additionalProperties": {
          "type": "string"
        }
      },
      "var_3": {
        "type": "string"
      }
    },
    "additionalProperties": false
  }
}

It will validate this:

[
    {
        "var_1": "some_string",
        "var_2a": {"foo": "x", "bar": "y"},
        "var_2b": {"foo": "x", "bar": "y"},
        "var_3": "another_string"        
    }
]

But fail this:

[
    {
        "var_1": "some_string",
        "var_2a": {"foo": 1},
        "var_2b": {"foo": true},
        "var_3": "another_string"        
    }
]
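
To check both documents yourself, here is a minimal sketch assuming the third-party `jsonschema` package is installed (`pip install jsonschema`). The schema is the one above with `$id` and `title` left out for brevity; `describe_errors` is a hypothetical helper name used only for this example.

```python
from jsonschema import Draft7Validator  # third-party package

schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "var_1": {"type": "string"},
            "var_2a": {
                "type": "object",
                "patternProperties": {".*": {"type": "string"}},
                "additionalProperties": False,
            },
            "var_2b": {
                "type": "object",
                "additionalProperties": {"type": "string"},
            },
            "var_3": {"type": "string"},
        },
        "additionalProperties": False,
    },
}

valid_doc = [{
    "var_1": "some_string",
    "var_2a": {"foo": "x", "bar": "y"},
    "var_2b": {"foo": "x", "bar": "y"},
    "var_3": "another_string",
}]

invalid_doc = [{
    "var_1": "some_string",
    "var_2a": {"foo": 1},
    "var_2b": {"foo": True},
    "var_3": "another_string",
}]

validator = Draft7Validator(schema)


def describe_errors(document):
    """Return 'path: message' strings for every violation, sorted by path."""
    errors = sorted(
        validator.iter_errors(document),
        key=lambda e: [str(p) for p in e.absolute_path],
    )
    return [
        "/".join(str(p) for p in e.absolute_path) + ": " + e.message
        for e in errors
    ]


print(validator.is_valid(valid_doc))
for line in describe_errors(invalid_doc):
    print(line)
```

Using `iter_errors` rather than `validate` reports every violation instead of stopping at the first one, which helps when a document has many nested columns.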
Daniel Schneider
  • Thank you, @DanielSchneider. This is a really helpful explanation. It seems like this should work, but it fails when it runs through this line of Python: `json_schema["properties"].items()` with the error `KeyError: 'properties'`. Any idea what I should do to fix that? – alexb523 Nov 06 '22 at 16:19
  • The above data will validate fine against the schema if you use a tool like `Draft7Validator` from `jsonschema` -- how are you validating the data against the schema? – Daniel Schneider Nov 06 '22 at 22:53
  • I think your response is correct, but the way the information is being passed through an AWS Step Function unfortunately only allows for strict schemas. Thanks for your help. – alexb523 Nov 08 '22 at 01:47
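
On the `KeyError: 'properties'` raised in the comments: one likely cause (an assumption, since the flattening code is not shown) is that the top level of the answer's schema is an array, so its `properties` sit under `items`, and the var_2 objects have no `properties` key at all. A minimal standard-library sketch of a walker that tolerates those shapes follows; the name `iter_leaf_schemas` and the path notation (`[]`, `<any>`, `<pattern>`) are made up for illustration.

```python
from typing import Any, Dict, Iterator, Tuple


def _join(path: str, name: str) -> str:
    """Append a segment to a dotted path."""
    return f"{path}.{name}" if path else name


def iter_leaf_schemas(
    schema: Dict[str, Any], path: str = ""
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (path, subschema) for every non-container subschema."""
    schema_type = schema.get("type")

    if schema_type == "array" and isinstance(schema.get("items"), dict):
        # Arrays keep their element schema under "items", not "properties".
        yield from iter_leaf_schemas(schema["items"], f"{path}[]")
        return

    if schema_type == "object":
        # .get() avoids the KeyError when an object has no fixed properties.
        for name, sub in schema.get("properties", {}).items():
            yield from iter_leaf_schemas(sub, _join(path, name))
        for pattern, sub in schema.get("patternProperties", {}).items():
            yield from iter_leaf_schemas(sub, _join(path, f"<{pattern}>"))
        extra = schema.get("additionalProperties")
        if isinstance(extra, dict):  # may also be a plain boolean
            yield from iter_leaf_schemas(extra, _join(path, "<any>"))
        return

    # Anything else (string, number, $ref, ...) is treated as a leaf.
    yield path, schema


schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "var_1": {"type": "string"},
            "var_2a": {
                "type": "object",
                "patternProperties": {".*": {"type": "string"}},
                "additionalProperties": False,
            },
            "var_2b": {
                "type": "object",
                "additionalProperties": {"type": "string"},
            },
            "var_3": {"type": "string"},
        },
        "additionalProperties": False,
    },
}

flat = dict(iter_leaf_schemas(schema))
```

Here `flat` maps paths such as `[].var_2b.<any>` to `{"type": "string"}`. This only helps if the downstream consumer accepts open-ended property names; if it needs every column listed explicitly, the columns have to be enumerated under `properties`.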