
I am currently running a Spark job on Dataproc and am getting errors trying to re-join a group and read data from a Kafka topic. I have done some digging and am not sure what the issue is. I have `auto.offset.reset` set to `earliest`, so it should be reading from the earliest available non-committed offset.
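For context, my consumer setup looks roughly like this (a simplified sketch; the broker address and variable names are illustrative, the relevant part is `auto.offset.reset`):

```
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker:9092",          // illustrative address
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "demo-group",
  "auto.offset.reset" -> "earliest",             // the setting in question
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

// ssc is the job's existing StreamingContext
val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("demo.topic"), kafkaParams))
```

Initially my Spark logs look like this: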

```
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-11 to offset 5553330.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-2 to offset 5555553.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-3 to offset 5555484.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-4 to offset 5555586.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-5 to offset 5555502.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-6 to offset 5555561.
19/04/29 16:30:30 INFO org.apache.kafka.clients.consumer.internals.Fetcher: [Consumer clientId=consumer-1, groupId=demo-group] Resetting offset for partition demo.topic-7 to offset 5555542.
```

But then on the very next line I get an error for trying to read a nonexistent offset on the server (you can see that the offset for partition 11 differs from the one listed above, so I have no idea why it would attempt to read from that offset). Here is the error:

```
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions: {demo.topic-11=4544296}
```

Any ideas as to why my Spark job keeps going back to this offset (4544296) instead of the one it logs initially (5553330)?

It seems to be contradicting itself with (a) the offset it says it is on versus the one it attempts to read, and (b) the claim that there is no configured reset policy, even though auto.offset.reset is set.

Jared
  • Using Structured Streaming or Dstream? – OneCricketeer Apr 29 '19 at 18:00
  • @cricket_007 dstreams – Jared Apr 29 '19 at 18:01
  • According to the documentation, the property value should be set to `smallest`, not `earliest`, but also if the consumer has not been started in some time, then consumer groups will expire the offsets, causing the app to "reset" to a different value... Meanwhile, Spark might be trying to recover from a checkpoint, for example, and offsets are also stored there (or elsewhere, if you've configured it that way) – OneCricketeer Apr 29 '19 at 19:31
  • @cricket_007 "smallest" and "largest" are property values for the old consumer config, I am using the new consumer config which was updated a few spark versions ago https://kafka.apache.org/documentation.html#newconsumerconfigs any other values than "earliest" and "latest" will throw a consumer config error – Jared Apr 29 '19 at 20:13
  • Sorry, was reading the `streaming-0-8` docs. How are you [storing offsets](https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html#storing-offsets)? – OneCricketeer Apr 29 '19 at 20:58
  • I think on one hand, you can check the offset by following https://stackoverflow.com/questions/34019386/how-to-check-consumer-offsets-when-the-offset-store-is-kafka, on the other hand, you need to check why it says "no configured reset policy". – Dagang Apr 30 '19 at 20:54

1 Answer


One year too late with this answer, but hoping to help others facing a similar issue.

Typically, this behavior shows up when the consumer tries to read an offset in a Kafka topic that is no longer there, usually because it has been removed by the Kafka log cleaner (e.g. due to retention or compaction policies). However, the consumer group is still known to Kafka, which kept the information on the latest consumed message of the group "demo-group" for the topic "demo.topic" and all its partitions.
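If you want to verify this, you can compare the group's committed offset with the partition's current log start offset. A minimal sketch using the plain Kafka consumer API (the broker address is illustrative):

```
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer

// Throwaway consumer, used only to query offsets
val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.deserializer", classOf[StringDeserializer].getName)
props.put("value.deserializer", classOf[StringDeserializer].getName)

val consumer = new KafkaConsumer[String, String](props)
val tp = new TopicPartition("demo.topic", 11)

// beginningOffsets returns the current log start offset of the partition;
// a committed offset below it (here 4544296) points at data that the
// cleaner has already deleted
val earliest = consumer.beginningOffsets(Collections.singleton(tp)).get(tp)
val latest = consumer.endOffsets(Collections.singleton(tp)).get(tp)
println(s"demo.topic-11: earliest=$earliest latest=$latest")
consumer.close()
```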

Therefore, the auto.offset.reset configuration does not have any impact: Kafka already knows the consumer group and its committed offsets, so from the broker's point of view there is nothing to reset. The job simply resumes at the committed offset (4544296), which by now lies below the log start offset of partition 11, hence the OffsetOutOfRangeException.

In addition, the Fetcher only tells you the latest available offset within each partition of the topic. That does not automatically mean the job actually polls all messages up to this offset: Spark decides how many messages it consumes and processes per partition and batch (based, e.g., on the configuration `spark.streaming.kafka.maxRatePerPartition`).
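To illustrate, with a hypothetical rate cap like the one below, a 10-second batch advances each partition by at most 10,000 records, so the group's committed offsets can trail far behind the latest offsets the Fetcher reports:

```
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical cap: at most 1,000 records per partition per second
val conf = new SparkConf()
  .setAppName("demo")
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
val ssc = new StreamingContext(conf, Seconds(10))
```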

To solve this issue you could either change the consumer group (which is probably not what you want in this particular case) or manually reset the offsets for the consumer group "demo-group", e.g. for partition 11:

```
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group demo-group --reset-offsets --to-latest \
  --topic demo.topic:11 --execute
```

Depending on your requirements you can reset the offsets of each partition of the topic with that tool (e.g. `--to-earliest` instead of `--to-latest` if you would rather resume from the oldest retained message). Leaving out `--execute` performs a dry run that only prints the planned reset. The tool's help output and the documentation explain all available options.

Michael Heil