According to the Avro schema specification (section on Unions): https://avro.apache.org/docs/current/spec.html
Unions
Unions, as mentioned above, are represented using JSON arrays. For example, ["null", "string"] declares a schema which may be either a null or string.
(Note that when a default value is specified for a record field whose type is a union, the type of the default value must match the first element of the union. Thus, for unions containing "null", the "null" is usually listed first, since the default value of such unions is typically null.)
As I read the standard, when a union-typed field declares a default value, the type of that default must be listed first in the union, followed by the actual data type.
In our product, we are using Avro encoding with the following Schema:
{
  "name": "data",
  "type": {
    "name": "data",
    "type": "record",
    "fields": [
      {
        "name": "data_asset",
        "type": ["string", "null"],
        "default": null,
        "doc": "The serialized JSON describing the entity - can be null for special cases"
      }
    ]
  }
}
What we have found is that, although the specification says the default value "must" match the first element of the union, the schema validator throws no errors when the order is reversed (["string", "null"] with a null default), as shown above.
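To make the mismatch concrete, here is a minimal sketch of the check I expected a validator to perform. It uses only the Python standard library. The helper name default_matches_first_branch and the type mapping are my own hypothetical names, not part of any Avro library, and the sketch only handles primitive branch types.

```python
import json

# Hypothetical mapping from Avro primitive type names to Python checks.
# Assumption: only primitives are covered; records, enums, arrays and maps
# are out of scope for this sketch.
_PRIMITIVE_CHECKS = {
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "long": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "double": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "bytes": lambda v: isinstance(v, str),  # JSON defaults for bytes are strings
}


def default_matches_first_branch(field):
    """Return True if the field's default fits the first branch of its union."""
    field_type = field["type"]
    # The rule only applies to union fields that actually declare a default.
    if not isinstance(field_type, list) or "default" not in field:
        return True
    first = field_type[0]
    if not isinstance(first, str) or first not in _PRIMITIVE_CHECKS:
        raise ValueError(f"branch type not supported by this sketch: {first!r}")
    return _PRIMITIVE_CHECKS[first](field["default"])


# The field from the schema above, with "string" listed before "null".
field = json.loads(
    '{"name": "data_asset", "type": ["string", "null"], "default": null}'
)
# The same field with the union reordered so that "null" comes first.
reordered = dict(field, type=["null", "string"])

print(default_matches_first_branch(field))      # False
print(default_matches_first_branch(reordered))  # True
```

Under this reading of the rule, our field fails the check and the reordered union passes, which is why I expected the validator to complain.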
The question I have is: Why does the validation pass, even though it is "incorrect" as per the standard?