If I create multiple of the following, one for each topic:
KafkaSource<T> kafkaDataSource = KafkaSource.<T>builder()
        .setBootstrapServers(consumerProps.getProperty("bootstrap.servers"))
        .setTopics(topic)
        .setDeserializer(deserializer)
        .setGroupId(identifier)
        .setProperties(consumerProps)
        .build();
The deserializer seems to run into an issue: it ends up reading data from a different topic, with a different schema than the one it was meant for, and fails!
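One way to confirm whether records from another topic are really reaching the wrong deserializer instance would be a guard wrapper. A minimal sketch, assuming the deserializer implements KafkaRecordDeserializationSchema (TopicGuardDeserializer is a name I made up):

import java.io.IOException;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Diagnostic wrapper: fails loudly if a record from an unexpected
// topic reaches this deserializer instance.
public class TopicGuardDeserializer<T> implements KafkaRecordDeserializationSchema<T> {
    private final String expectedTopic;
    private final KafkaRecordDeserializationSchema<T> inner;

    public TopicGuardDeserializer(String expectedTopic,
                                  KafkaRecordDeserializationSchema<T> inner) {
        this.expectedTopic = expectedTopic;
        this.inner = inner;
    }

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record, Collector<T> out)
            throws IOException {
        if (!expectedTopic.equals(record.topic())) {
            throw new IOException("Deserializer for " + expectedTopic
                    + " received a record from " + record.topic());
        }
        inner.deserialize(record, out);
    }

    @Override
    public TypeInformation<T> getProducedType() {
        return inner.getProducedType();
    }
}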
If I provide all topics in the same KafkaSource, the watermarks seem to progress across the topics together:
DataStream<T> dataSource = environment.fromSource(
        kafkaDataSource,
        WatermarkStrategy.<T>forBoundedOutOfOrderness(Duration.ofMillis(2000))
                .withTimestampAssigner((event, timestamp) -> {...}),
        "");
Also, the Avro data in Kafka itself carries the schema magic byte as its first byte (the schema info is embedded); I'm not using any external Avro registry (it's all in the libraries).
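For context, the per-topic deserialization looks roughly like the sketch below, assuming the payload is a one-byte magic prefix followed by Avro binary and the writer schema is known per topic (EmbeddedSchemaAvroDeserializer is my own simplified name; GenericRecordAvroTypeInfo comes from flink-avro):

import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.flink.formats.avro.typeutils.GenericRecordAvroTypeInfo;
import org.apache.flink.util.Collector;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Sketch: payload is <magic byte><avro binary>; the writer schema for the
// topic is passed in as a string so the class stays serializable.
public class EmbeddedSchemaAvroDeserializer
        implements KafkaRecordDeserializationSchema<GenericRecord> {

    private final String writerSchemaString;
    private transient GenericDatumReader<GenericRecord> reader;

    public EmbeddedSchemaAvroDeserializer(String writerSchemaString) {
        this.writerSchemaString = writerSchemaString;
    }

    @Override
    public void deserialize(ConsumerRecord<byte[], byte[]> record,
                            Collector<GenericRecord> out) throws IOException {
        if (reader == null) {
            Schema writerSchema = new Schema.Parser().parse(writerSchemaString);
            reader = new GenericDatumReader<>(writerSchema);
        }
        byte[] payload = record.value();
        // Skip the leading magic byte before handing the rest to Avro.
        BinaryDecoder decoder = DecoderFactory.get()
                .binaryDecoder(payload, 1, payload.length - 1, null);
        out.collect(reader.read(null, decoder));
    }

    @Override
    public TypeInformation<GenericRecord> getProducedType() {
        return new GenericRecordAvroTypeInfo(new Schema.Parser().parse(writerSchemaString));
    }
}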
It works fine with FlinkKafkaConsumer (I created multiple instances of it, one per topic):
FlinkKafkaConsumer<T> kafkaConsumer = new FlinkKafkaConsumer<>(topic, deserializer, consumerProps);
kafkaConsumer.assignTimestampsAndWatermarks(
        WatermarkStrategy.<T>forBoundedOutOfOrderness(Duration.ofMillis(2000))
                .withTimestampAssigner((event, timestamp) -> {...}));
I'm not sure whether the problem is in the way I'm using it. Any pointers on how to solve this would be appreciated. Also, FlinkKafkaConsumer is deprecated, so I'd like to get the KafkaSource approach working.
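For reference, the per-topic KafkaSource setup I'm aiming for is one source per topic, each built with its own deserializer instance (my current suspicion is that sharing a single deserializer object across sources is what mixes up the schemas). A sketch, where buildDeserializerFor and extractEventTime are hypothetical helpers:

// One KafkaSource per topic, each with a fresh deserializer instance.
for (String topic : topics) {
    KafkaSource<T> source = KafkaSource.<T>builder()
            .setBootstrapServers(consumerProps.getProperty("bootstrap.servers"))
            .setTopics(topic)
            .setDeserializer(buildDeserializerFor(topic))  // fresh instance per topic
            .setGroupId(identifier)
            .setProperties(consumerProps)
            .build();

    DataStream<T> stream = environment.fromSource(
            source,
            WatermarkStrategy.<T>forBoundedOutOfOrderness(Duration.ofMillis(2000))
                    .withTimestampAssigner((event, timestamp) -> extractEventTime(event)),
            "kafka-" + topic);
}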