I am running a streaming application that processes data from Kafka to Kafka using Spark. If I use the latest offsets, it works as expected and runs without any issue.
However, a bulk transaction (about 200,000 records) was written to the source topic, and when I use earliest the job has to process all of that data. In that case my Spark job does not process the data and gets stuck after 3 stages. Can someone suggest how I should handle this so that I can process this bulk data?
I am using the configurations below:
TRIGGERFREQUENCY: 1 second
STARTINGOFFSETS: earliest
--num-executors 6
--driver-cores 6
--driver-memory 8G
--executor-cores 6
--executor-memory 8G
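For reference, here is a minimal sketch of how the stream is wired up (this assumes Structured Streaming with the Kafka source; the broker address, topic names, and checkpoint path are placeholders, not my real values):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder()
  .appName("kafka-to-kafka-stream")
  .getOrCreate()

// Read from the source topic; startingOffsets = earliest so the bulk backlog is picked up
val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // placeholder broker
  .option("subscribe", "source-topic")              // placeholder topic
  .option("startingOffsets", "earliest")
  .load()

// Write back to Kafka with a 1-second processing-time trigger
val query = input
  .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")                 // placeholder broker
  .option("topic", "target-topic")                                  // placeholder topic
  .option("checkpointLocation", "/tmp/checkpoints/kafka-to-kafka")  // placeholder path
  .trigger(Trigger.ProcessingTime("1 second"))
  .start()

query.awaitTermination()
```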
I have also tried the configuration below in my Spark application:
--conf spark.streaming.backpressure.enabled=true
--conf spark.streaming.backpressure.initialRate=60
--conf spark.streaming.kafka.maxRatePerPartition=50
These settings were meant to control the number of events per batch, but they do not seem to take effect: I can see 30,000 records in the first batch, which Spark is not able to process in a single batch, so the job gets stuck.
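In case it matters, this is the equivalent of those --conf flags set programmatically. I am not sure whether the Structured Streaming Kafka source honours these spark.streaming.* settings at all, which may be why they are not being applied:

```scala
import org.apache.spark.sql.SparkSession

// Same values as the --conf flags above, attached via the session builder.
// Note: these are spark.streaming.* (DStream-era) settings; I am unsure
// whether the Structured Streaming Kafka source reads them at all.
val spark = SparkSession.builder()
  .appName("kafka-to-kafka-stream")
  .config("spark.streaming.backpressure.enabled", "true")
  .config("spark.streaming.backpressure.initialRate", "60")
  .config("spark.streaming.kafka.maxRatePerPartition", "50")
  .getOrCreate()
```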