I am having trouble with PubSub schema evolution: when I add a new field to the schema, its values end up as null in BigQuery. My setup is as follows.

PubSub Schema

{
  "namespace": "com.value.model",
  "name": "ValuePoint",
  "type" : "record",
  "fields" : [
    {
      "name" : "application_id",
      "type" : "string"
    },
    {
      "name" : "user_id",
      "type" : "string"
    }
  ]
}

PubSub to BQ Subscription

  • Created a PubSub topic from the schema above
  • Created a BQ table with the same schema
  • Created a PubSub-to-BQ table subscription

Updated Client

The client started sending Avro messages to the topic, and they flowed to BQ without problems.

Changing Schema (Evolution)

Added a new field and created a new revision of the schema:

{
  "namespace": "com.value.model",
  "name": "ValuePoint",
  "type" : "record",
  "fields" : [
    {
      "name" : "application_id",
      "type" : "string"
    },
    {
      "name" : "user_id",
      "type" : "string"
    },
    {
      "name" : "new_field",
      "type" : [null, "string"],
      "default" : "null"
    }
  ]
}
  • Updated the BQ table schema as well.
  • Started sending new data from the client, including the new field.
  • But the BQ table always shows the value of the new field as null.

Questions

  • I might be doing something wrong in the schema evolution, since I am passing null as the default value. But if I remove the default value, I get this error:
"Revision is incompatible with previous revision: 4179a86e. Failed with error: Presence of non-optional field new_field is inconsistent."
  • Secondly, do I need to create a new subscription whenever I update the schema revision?
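On the first question, the error is consistent with Avro's general schema-resolution rule: a reader using the new schema can only decode a message written with the old one if every added field has a default to fall back on. The following is a minimal sketch of that rule in plain Python, not Pub/Sub's actual compatibility check. The helper `resolve`, the sample record values, and the use of the corrected field declaration (`["null", "string"]` with an unquoted `null` default) are assumptions made for illustration.

```python
import json

# A record as written by a client that only knows the first schema revision.
# The values are made up for illustration.
OLD_RECORD = {"application_id": "app-1", "user_id": "u-42"}

# The new revision, with the nullable field declared the way Avro expects:
# the type name "null" is a string, and the default is JSON null.
NEW_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "ValuePoint",
  "namespace": "com.value.model",
  "fields": [
    {"name": "application_id", "type": "string"},
    {"name": "user_id", "type": "string"},
    {"name": "new_field", "type": ["null", "string"], "default": null}
  ]
}
""")


def resolve(record, reader_schema):
    """Fill fields missing from `record` with the reader schema's defaults."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            resolved[name] = field["default"]
        else:
            # No value in the old data and nothing to fall back on.
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved


print(resolve(OLD_RECORD, NEW_SCHEMA))
# {'application_id': 'app-1', 'user_id': 'u-42', 'new_field': None}
```

Removing the `"default"` key from `new_field` makes `resolve` raise for `OLD_RECORD`, which is the same kind of incompatibility the revision error appears to describe.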

My settings for the PubSub schema on the topic: (screenshot)

SRJ
  • Can you try again after updating the schema as follows: { "name": "new_field", "type": ["null", "string"], "default": null }. Let me know if that works! – kiran mathew Aug 24 '23 at 04:37
  • Thanks, but I can't see any difference between what I already tried and what you are asking me to try :). Let me know what I am missing. – SRJ Aug 24 '23 at 08:02
  • Okay, I see your change now: null is in double quotes. – SRJ Aug 24 '23 at 08:36
  • How long after evolving the schema were the values null? Schema changes are eventually consistent and so it is possible that for some time, the new columns will still not be populated. – Kamal Aboul-Hosn Aug 29 '23 at 11:24
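The quoting difference discussed in the comments above is easy to miss by eye, so here is a small sketch that flags it mechanically. It uses only the standard `json` module; the helper name `nullable_field_problems` is hypothetical, and the checks encode the usual Avro convention for an optional field (the string "null" as the first union branch, JSON null as the default). It does not claim to reproduce how Pub/Sub or BigQuery validate a schema, and it says nothing about whether this is the cause of the null values.

```python
import json


def nullable_field_problems(field):
    """Return a list of problems with an optional (nullable) Avro field."""
    ftype = field.get("type")
    if not isinstance(ftype, list):
        return ["type is not a union"]

    problems = []
    if None in ftype:
        # An unquoted null parses as JSON null, not as the Avro type name.
        problems.append('union contains JSON null; the type name is the string "null"')
    elif not ftype or ftype[0] != "null":
        problems.append('"null" should be the first branch when the default is null')

    if "default" not in field:
        problems.append("no default given")
    elif field["default"] is not None:
        # A quoted "null" is a four-character string, not a null value.
        problems.append(f"default is {field['default']!r}, not JSON null")
    return problems


# The field as posted in the question, and as suggested in the comment.
as_posted = json.loads('{"name": "new_field", "type": [null, "string"], "default": "null"}')
suggested = json.loads('{"name": "new_field", "type": ["null", "string"], "default": null}')

for label, field in (("as posted", as_posted), ("suggested", suggested)):
    print(label, nullable_field_problems(field))
```

The posted declaration is reported with two problems (the unquoted type name and the quoted default), while the suggested one comes back with an empty list.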
