
We have an AWS EMR setup to process jobs written in Scala. We are able to run the jobs on a small dataset, but when running the same job on a large dataset I get the exception "java.io.IOException: All datanodes are bad."


1 Answer


Setting spark.shuffle.service.enabled to true resolved this issue for me.

The default AWS EMR configuration sets spark.dynamicAllocation.enabled to true, but leaves spark.shuffle.service.enabled set to false.

spark.dynamicAllocation.enabled allows Spark to dynamically assign executors to different tasks. When spark.shuffle.service.enabled is set to false, the external shuffle service is disabled and shuffle data is kept only on the executors. When an executor is removed or reassigned, that data is lost, and the exception "java.io.IOException: All datanodes are bad." is thrown when the data is requested.
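For reference, here is a minimal sketch of setting both properties from the Scala job itself. The application name is a placeholder, and this assumes the YARN external shuffle service is actually available on the cluster nodes (EMR normally runs it); the properties must be set before the SparkContext is created.

import org.apache.spark.sql.SparkSession

// Enable dynamic allocation together with the external shuffle service,
// so shuffle data survives when an executor is released.
val spark = SparkSession.builder()
  .appName("large-dataset-job") // placeholder name
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  .getOrCreate()

The same properties can also be passed at submit time with spark-submit --conf, or cluster-wide through EMR's spark-defaults configuration classification, which avoids hard-coding them in the job.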
