In my Spark job I initialize a Kafka stream with KafkaUtils.createDirectStream. I read about the seekToEnd method for Consumer. How can I apply it to the stream?
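The Spark Kafka integration transitively includes kafka-clients, so you're welcome to initialize a raw consumer instance on your own and seek it.
Alternatively, if the consumer group has no committed offsets yet, you would set auto.offset.reset=latest in the Kafka params passed to createDirectStream (the Structured Streaming equivalent is the startingOffsets=latest source option).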
Note: the Kafka Direct Stream API is deprecated as of Spark 2.4, and you should be using Structured Streaming instead.
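For the Structured Streaming route, here is a minimal PySpark sketch of reading a topic from the latest offsets. The helper names (kafka_source_options, read_from_latest), the broker address and the topic name are all made up for illustration; it assumes you already have a SparkSession and that the spark-sql-kafka-0-10 package is on the classpath.

```python
from typing import Dict


def kafka_source_options(bootstrap_servers: str, topic: str) -> Dict[str, str]:
    """Build options for Spark's Structured Streaming Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Start at the end of every partition when the query has no checkpoint yet.
        "startingOffsets": "latest",
    }


def read_from_latest(spark, bootstrap_servers: str, topic: str):
    """Return a streaming DataFrame over `topic`, beginning at the latest offsets.

    `spark` is an existing SparkSession supplied by the caller.
    """
    options = kafka_source_options(bootstrap_servers, topic)
    return spark.readStream.format("kafka").options(**options).load()
```

You would call it as read_from_latest(spark, "localhost:9092", "events"). Keep in mind that startingOffsets only applies when a query starts fresh; a restarted query resumes from its checkpoint. "latest" is also the default for streaming queries, so setting it mainly makes the intent explicit.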