Our system consists of multiple microservices that emit and consume events encoded in Avro format (see the schemas at the bottom). A particular use case is the following: Service A emits an event (of type InvoiceEvents) on topic T1, and Services B and C (owned by different dev teams) consume from T1. For example, Service B belongs to the Tax team, while Service C belongs to the Product Fulfilment team.

I was expecting the following to be true (but it seems not to be):

  1. The schema could evolve from version 1 (v1) to version 2 (v2) by adding a new type to a union (i.e. InvoiceCreated for the field "payload"). See the sample schemas at the bottom.
  2. The producing Service A could upgrade to v2 (i.e. produce events that follow v2).
  3. Some consuming services (e.g. Service C) could still use v1, as they are not interested in the new event type (i.e. InvoiceCreated). In this case, the "payload" field would take the default (null) value when de-serialised.
  4. Eventually, and only if required for business reasons, Service C could upgrade to v2 in order to react to the new event type (i.e. InvoiceCreated).

But Service C cannot de-serialise new events of type InvoiceCreated. Specifically, it throws:

org.apache.avro.AvroTypeException: Found com.elsevier.q2c.schema.avro.invoice.InvoiceCreated, expecting union
    at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:292)

Are Avro union types not forward compatible (as described above)? Are they only backward compatible, as implied by the Confluent Schema Registry tests? What is the recommended way to avoid coupling the microservices? I guess Avro unions cannot be used.

Thanks!!

Related link with no definite answer: Avro-union-compatibility-mode-enhancement-proposal


schema v1:

[
   ...
   {
      "type":"record",
      "name":"InvoiceEvents",
      "namespace":"bla.bla.schema.avro.invoice",
      "fields":[
         {
            "name":"payload",
            "type":[
               "null",
               "bla.bla.schema.avro.invoice.InvoiceDrafted"
            ],
            "default":null
         }
      ]
   }
]

schema v2 (added new Union type: InvoiceCreated):

[
   ...
   {
      "type":"record",
      "name":"InvoiceEvents",
      "namespace":"bla.bla.schema.avro.invoice",
      "fields":[
         {
            "name":"payload",
            "type":[
               "null",
               "bla.bla.schema.avro.invoice.InvoiceDrafted",
               "bla.bla.schema.avro.invoice.InvoiceCreated"
            ],
            "default":null
         }
      ]
   }
]
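
For reference, the Avro specification's schema-resolution rule for unions requires the reader to find the branch the writer actually used in its own union; otherwise an error is signalled. Below is a minimal sketch of that rule in plain Python. It does not use the Avro library, it assumes branches are matched by full name only, and the helper names `unresolvable_branches` and `resolve` are hypothetical.

```python
# Simplified model of Avro union resolution: branches are matched by full name only.
# Real Avro also resolves matching records field by field and promotes some primitives.

V1_PAYLOAD = [
    "null",
    "bla.bla.schema.avro.invoice.InvoiceDrafted",
]
V2_PAYLOAD = V1_PAYLOAD + ["bla.bla.schema.avro.invoice.InvoiceCreated"]


def unresolvable_branches(writer_union, reader_union):
    """Return the writer branches that the reader's union cannot match."""
    return [branch for branch in writer_union if branch not in reader_union]


def resolve(written_branch, reader_union):
    """Mimic the reader: fail only when the branch actually written is unknown."""
    if written_branch not in reader_union:
        # Stand-in for org.apache.avro.AvroTypeException
        raise TypeError(f"Found {written_branch}, expecting union")
    return written_branch


# Forward direction: v2 writer, v1 reader (Service C)
print(unresolvable_branches(V2_PAYLOAD, V1_PAYLOAD))
# Backward direction: v1 writer, v2 reader
print(unresolvable_branches(V1_PAYLOAD, V2_PAYLOAD))
```

Under this model a v1 reader still handles v2 events whose payload is null or InvoiceDrafted and fails only on events that actually carry InvoiceCreated. The field default does not help here, because defaults are applied when a field is missing from the writer's schema, not when a union branch is unknown to the reader.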
  • The [Avro specification](https://avro.apache.org/docs/current/spec.html#Schema+Resolution) says if the selected writer's union schema is not one of the schemas in the reader's union, then an error is signaled. – Chin Huang Jan 30 '19 at 18:59
  • Indeed, adding a new type to a union is not a forward-compatible change – Vassilis Jan 31 '19 at 16:08

1 Answer

After some thought, we will probably go for option 3 below, since not skipping or losing events matters more to the project than decoupling. The options we considered:

  1. Handle the exception in a custom deserialiser and skip the event (may lose interesting events; to avoid losing events, all consuming services must be upgraded before any producing service).
  2. Convert all custom record unions to separate optional fields (may lose interesting events, as the change is forward compatible and consuming services will not block).
  3. Accept the de-serialisation error, let consumption block, and bump the schema version in every consuming service that uses the schema whenever a new custom record type is added (this guarantees that no interesting event is lost).
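
The difference between options 1 and 3 can be sketched roughly as follows. This is plain Python, not Kafka or Avro code: `decode_v1`, `consume`, and `UnknownUnionBranch` are hypothetical stand-ins, and events are modelled as simple dicts with a `type` key.

```python
class UnknownUnionBranch(Exception):
    """Stand-in for the AvroTypeException raised by a v1 reader."""


def decode_v1(raw):
    """Hypothetical v1 decoder: knows only the null and InvoiceDrafted branches."""
    known = {"null", "InvoiceDrafted"}
    if raw["type"] not in known:
        raise UnknownUnionBranch(f"Found {raw['type']}, expecting union")
    return raw


def consume(events, decode, on_error="block"):
    """Process events in order; return (handled, skipped_offsets, blocked_at)."""
    handled, skipped = [], []
    for offset, raw in enumerate(events):
        try:
            handled.append(decode(raw))
        except UnknownUnionBranch:
            if on_error == "skip":
                # Option 1: drop the event and carry on (the event is lost).
                skipped.append(offset)
                continue
            # Option 3: stop at the failing offset until the consumer is upgraded.
            return handled, skipped, offset
    return handled, skipped, None


events = [
    {"type": "InvoiceDrafted"},
    {"type": "InvoiceCreated"},
    {"type": "InvoiceDrafted"},
]
print(consume(events, decode_v1, on_error="skip"))
print(consume(events, decode_v1, on_error="block"))
```

With "skip" the consumer keeps moving but never sees the InvoiceCreated event; with "block" nothing is lost, but every event behind the failing offset waits until the consumer is upgraded.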

Please comment if there is a better option out there and I have missed it!

UPDATE: It seems that option (2) is now possible in a much cleaner way, since multi-type topics are now supported (https://github.com/confluentinc/schema-registry/pull/680). This means that a topic can carry different value types (e.g. InvoiceCreated, InvoiceEdited, ...) without using an Avro union, and each type has its own evolution line!
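
As far as I understand it, multi-type topics work by changing how the Schema Registry subject is derived for each message. The sketch below models the three subject naming strategies as I read them in the Confluent documentation; it is plain Python rather than the actual serializer code, and the function `subject_name` is hypothetical.

```python
def subject_name(strategy, topic, record_fullname, is_key=False):
    """Model of how each strategy derives the Schema Registry subject."""
    suffix = "key" if is_key else "value"
    if strategy == "TopicNameStrategy":
        # One subject per topic: every value on the topic shares one schema lineage.
        return f"{topic}-{suffix}"
    if strategy == "RecordNameStrategy":
        # One subject per record type, shared across topics.
        return record_fullname
    if strategy == "TopicRecordNameStrategy":
        # One subject per record type within a given topic.
        return f"{topic}-{record_fullname}"
    raise ValueError(f"unknown strategy: {strategy}")


for record in (
    "bla.bla.schema.avro.invoice.InvoiceDrafted",
    "bla.bla.schema.avro.invoice.InvoiceCreated",
):
    print(subject_name("TopicRecordNameStrategy", "T1", record))
```

With a per-record strategy, InvoiceDrafted and InvoiceCreated on topic T1 are registered under separate subjects, so compatibility is checked per event type rather than against one union schema.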
