
I came across the following:

For possible kafkaParams, see Kafka consumer config docs. If your Spark batch duration is larger than the default Kafka heartbeat session timeout (30 seconds), increase heartbeat.interval.ms and session.timeout.ms appropriately. For batches larger than 5 minutes, this will require changing group.max.session.timeout.ms on the broker.

on this page: https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html

Does this apply if I have the property below set on my Spark conf?

conf.set("spark.streaming.kafka.consumer.poll.ms", "5000")

Also, what is the rationale behind setting heartbeat.interval.ms and session.timeout.ms larger than the Kafka stream batch duration? Won't heartbeats to Kafka piggyback on consumer poll requests?
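To make the docs' advice concrete, here is a minimal sketch of the kafkaParams that follow it for a 1-minute batch. The bootstrap server, group id, and the exact multipliers are illustrative assumptions, not values from the docs:

```scala
// A minimal sketch, assuming a hypothetical 1-minute batch and illustrative
// broker address / group id. Following the docs' advice, session.timeout.ms
// is raised above the batch duration, and heartbeat.interval.ms is kept well
// below session.timeout.ms (Kafka recommends no more than 1/3 of it).
val batchDurationMs = 60000
val kafkaParams = Map[String, Object](
  "bootstrap.servers"     -> "localhost:9092",
  "group.id"              -> "example-group",
  "heartbeat.interval.ms" -> (batchDurationMs / 3).toString, // 20000
  "session.timeout.ms"    -> (batchDurationMs * 2).toString  // 120000, > 1-minute batch
)
```

This map would then be passed to the direct stream the same way as in the integration guide's examples.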

Also, I was running a Spark Streaming application and Kafka on my local machine. My batch duration was 1 minute, and my Kafka configuration was as follows:

heartbeat.interval.ms = 3000
session.timeout.ms = 30000
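Checking these values against the warning quoted from the docs, a small sketch (using only the numbers stated in this question):

```scala
// Sanity check of the values above against the docs' warning.
val batchDurationMs     = 60 * 1000 // 1-minute Spark batch
val heartbeatIntervalMs = 3000
val sessionTimeoutMs    = 30000
// The docs say to increase the timeouts when the batch duration is larger
// than session.timeout.ms; with these values the batch does exceed it:
val batchExceedsSessionTimeout = batchDurationMs > sessionTimeoutMs // true
```

So this setup is exactly the case the docs warn about, which is why the absence of rebalancing is surprising.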

However, I did not see any problems when running with a batch duration of 1 minute and the above values for heartbeat interval and session timeout. Am I missing something here?

Perhaps I am not seeing any rebalancing because my Spark Streaming application is the only consumer of the topic? So even though my batch duration is larger than my heartbeat and session timeout intervals, Kafka doesn't kick my consumer out? Is that possible? – Abdul Rahman Jul 25 '18 at 02:12

0 Answers