I am using Spark 2.4.0 Structured Streaming in batch mode (i.e. spark.read rather than spark.readStream) to consume a Kafka topic. I checkpoint the read offsets myself and pass them back via .option("startingOffsets", ...)
to dictate where the next job run should continue reading.
The docs say: "Newly discovered partitions during a query will start at earliest."
However, testing showed that when a new partition is added and I reuse the last checkpointed offsets, I get the following error:
Caused by: java.lang.AssertionError: assertion failed: If startingOffsets contains specific offsets, you must specify all TopicPartitions.
How can I check programmatically whether any new partitions were created, so that I can update my startingOffsets parameter before starting the read?
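One way to sidestep the assertion is to rebuild the startingOffsets JSON from the checkpoint plus the topic's current partition list (which you could fetch, for example, with kafka-python's KafkaConsumer.partitions_for_topic, or the Kafka AdminClient). Below is a minimal, hypothetical sketch of just the merge step: any partition missing from the checkpoint is assigned -2, which Spark treats as "earliest". The function name and data shapes are illustrative assumptions, not part of Spark's API.

```python
import json

def merge_starting_offsets(topic, checkpointed, current_partitions):
    """Build a startingOffsets JSON string covering every partition.

    topic:              topic name (str)
    checkpointed:       dict {partition id (int): last committed offset (int)}
                        recovered from the previous run's checkpoint
    current_partitions: iterable of all partition ids currently on the topic

    Partitions present on the topic but absent from the checkpoint are new;
    they get -2, which Spark's Kafka source interprets as "earliest".
    """
    offsets = {str(p): checkpointed.get(p, -2) for p in current_partitions}
    return json.dumps({topic: offsets})

# Example: the checkpoint knows partitions 0 and 1, but partition 2
# was added to the topic since the last run.
print(merge_starting_offsets("events", {0: 100, 1: 250}, [0, 1, 2]))
# {"events": {"0": 100, "1": 250, "2": -2}}
```

The resulting string can be passed directly to .option("startingOffsets", ...), which satisfies the requirement that specific offsets must cover all TopicPartitions.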