
We started receiving this generic error today:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: java.io.EOFException

Saw some articles suggesting this can be caused by big files, missing libraries, or memory constraints, for example:

https://datascience.stackexchange.com/questions/40130/pyspark-java-io-eofexception

PySpark throws java.io.EOFException when reading big files with boto3

DetroitMike

1 Answer


It ended up being an empty .seq file written by one of our ETL tools. Removing that invalid file resolved the issue for us.
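
In case it helps anyone else, here is a minimal sketch of how one might skip zero-byte objects before handing the paths to Spark. The bucket name, prefix, and reading via sc.sequenceFile over s3a are assumptions about your setup (you also need hadoop-aws/s3a configured on the cluster); adjust as needed.

```python
import boto3
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("skip-empty-seq-files").getOrCreate()
sc = spark.sparkContext

s3 = boto3.client("s3")
bucket = "my-etl-bucket"   # placeholder: your bucket
prefix = "landing/seq/"    # placeholder: your prefix

# Keep only non-empty .seq objects; a zero-byte SequenceFile has no header,
# so Hadoop hits end-of-file immediately and raises java.io.EOFException.
paths = []
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        if obj["Key"].endswith(".seq") and obj["Size"] > 0:
            paths.append(f"s3a://{bucket}/{obj['Key']}")

# Spark/Hadoop accept a comma-separated list of input paths.
rdd = sc.sequenceFile(",".join(paths))
print(rdd.count())
```

This just filters the listing up front; the longer-term fix for us was stopping the ETL tool from writing empty .seq files in the first place.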

DetroitMike