We started receiving this generic error today:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: java.io.EOFException
I saw some articles suggesting this can be caused by big files, missing libraries, or memory constraints, for example:
https://datascience.stackexchange.com/questions/40130/pyspark-java-io-eofexception
PySpark throws java.io.EOFException when reading big files with boto3
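
Since the message above is only the outermost wrapper, one thing that might help narrow it down is pulling every "Caused by:" line out of the full driver or executor log to see what sits underneath the SparkException. Below is a minimal sketch in Python, assuming the log is available as plain text. The names `cause_chain` and `sample` are hypothetical helpers for illustration, not part of Spark or PySpark.

```python
import re

# Matches lines such as "Caused by: java.io.EOFException: some message".
# The message part after the class name is optional.
CAUSED_BY = re.compile(r"^\s*Caused by:\s*(?P<cls>[\w.$]+)(?::\s*(?P<msg>.*))?$")


def cause_chain(log_text):
    """Return (exception_class, message) pairs for each 'Caused by:' line, in log order."""
    chain = []
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            message = (match.group("msg") or "").strip()
            chain.append((match.group("cls"), message))
    return chain


# The one line quoted in this post; a real log would usually contain several.
sample = (
    "Caused by: org.apache.spark.SparkException: "
    "Job aborted due to stage failure: java.io.EOFException"
)

for exception_class, message in cause_chain(sample):
    print(exception_class, "|", message)
```

With a complete log, the last entry in the returned list is usually the most specific cause, which may help tell a memory problem apart from a truncated read or a missing class.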