I am facing issues with PubSub schema evolution: when I add new fields, their values end up being null in BigQuery. My setup is as follows.
PubSub Schema
{
"namespace": "com.value.model",
"name": "ValuePoint",
"type" : "record",
"fields" : [
{
"name" : "application_id",
"type" : "string"
},
{
"name" : "user_id",
"type" : "string"
}
]
}
PubSub to BQ Subscription
- Created a PubSub topic from the above schema
- Created a BQ table with the same schema
- Created a PubSub to BQ table subscription
Updated Client
The client started sending Avro messages to the topic, and they flowed through to BQ fine.
Changing Schema (Evolution)
Added a new field and created a new revision of the schema
{
"namespace": "com.value.model",
"name": "ValuePoint",
"type" : "record",
"fields" : [
{
"name" : "application_id",
"type" : "string"
},
{
"name" : "user_id",
"type" : "string"
},
{
"name" : "new_field",
"type" : [null, "string"],
"default" : "null"
}
]
}
- Updated the BQ table schema as well
- Started sending new data from the client with the new field.
- But the BQ table always showed the value of the new field as null.
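As a side note, here is a minimal Python sketch for sanity-checking how the optional field is declared. It assumes the usual Avro convention that a nullable union lists the type name "null" as a string and that the default is JSON null rather than the string "null". The helper name optional_field_problems is hypothetical, and whether this declaration is related to the null values in BigQuery is not established here.

```python
import json


def optional_field_problems(field):
    """Return a list of likely problems with a nullable Avro field declaration."""
    problems = []
    ftype = field.get("type")
    if not isinstance(ftype, list):
        problems.append("type is not a union, so the field cannot be optional")
        return problems
    # Avro type names are strings; a bare JSON null parses to None, not "null".
    if "null" not in ftype:
        problems.append('union has no "null" branch')
    if "default" not in field:
        problems.append("no default value is declared")
    elif field["default"] is not None:
        # The string "null" is a string value, not a null default.
        problems.append("default is not JSON null")
    return problems


# The field as posted in the revised schema.
field_as_posted = json.loads(
    '{"name": "new_field", "type": [null, "string"], "default": "null"}'
)
# An adjusted variant, assuming the conventional nullable declaration.
field_adjusted = json.loads(
    '{"name": "new_field", "type": ["null", "string"], "default": null}'
)

print(optional_field_problems(field_as_posted))
print(optional_field_problems(field_adjusted))
```

The first call reports two findings for the field as posted, and the second reports none for the adjusted variant.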
Questions
- I might be doing something wrong with the schema evolution, since I am passing the default value as null. But if I remove the default value, I get this error:
"Revision is incompatible with previous revision: 4179a86e. Failed with error: Presence of non-optional field new_field is inconsistent."
- Secondly, do I need to create a new subscription whenever I update the schema revision?