I am trying to write a dataframe to an s3 location in JSON format. But whenever an executor task fails and Spark retries the stage it throws a FileAlreadyExistsException.

A similar question has been asked before but it addresses ORC files with a separate spark conf and doesn't address my issue.

This is my code:

val result = spark.sql(query_that_OOMs_executor)
result.write.mode(SaveMode.Overwrite).json(s3_path)

From the spark UI, the error on the executor says

ExecutorLostFailure (executor 302 exited caused by one of the running tasks) 
Reason: Container killed by YARN for exceeding memory limits. 4.5 GB of 4.5 GB physical memory used. 
Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714.

But the driver stack trace says

Job aborted due to stage failure: Task 1344 in stage 2.0 failed 4 times, most recent failure: Lost task 1344.3 in stage 2.0 (TID 25797, executor.ec2.com, executor 217): org.apache.hadoop.fs.FileAlreadyExistsException: s3://prod-bucket/application_1590774027047/-650323473_1594243391573/part-01344-dc971661-93ef-4abc-8380-c000.json already exists

How do I make it so that spark tries to overwrite this JSON file? This way I'll get the real reason on the driver once all 4 retries fail. I've already set the mode to overwrite so that's not helping.

sbrk
  • I also had the same issue: sometimes it would work and sometimes it wouldn't. To work around it, I added code to delete the directory before writing. – Srinivas Jul 09 '20 at 03:26
  • @Srinivas that wouldn't work because in my case I make sure that the `s3_path` is unique before calling `.json(s3_path)`. The exception occurs when a task fails partway through writing and the retry finds the path already there. – sbrk Jul 09 '20 at 14:45
  • What `FileOutputCommitter` are you using? You might be interested to check this out https://hadoop.apache.org/docs/r3.1.1/hadoop-aws/tools/hadoop-aws/committers.html. – mazaneicha Jul 10 '20 at 00:37

1 Answer

This happened because of a fundamental problem with the DirectFileOutputCommitter, which was being used here by default.

There are two things here: the executor OOM, and then the FileAlreadyExistsException on retries, which caused the retries (and hence the SQL query) to fail.

Reason: the DirectFileOutputCommitter writes each task attempt's output files straight to the final output path, with no staging directory and no rename at commit time. So when an attempt dies mid-write and Spark retries the task, the new attempt finds the partially written file already sitting at the final path. This direct-write approach is prone to inconsistencies and partial results on failure, and Spark recommends against direct output committers.
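A minimal sketch of why this breaks retries, with local file I/O standing in for S3 writes (all names here are illustrative, not Spark's actual internals):

```python
import os
import tempfile

def direct_task_attempt(final_path, data, fail=False):
    """A task under a 'direct' committer writes straight to the final
    output path. If the attempt dies mid-write, its partial file is
    left behind at the final location."""
    if os.path.exists(final_path):
        # This is what surfaces as FileAlreadyExistsException in Spark.
        raise FileExistsError(final_path)
    with open(final_path, "w") as f:
        f.write(data[: len(data) // 2])
        if fail:
            raise RuntimeError("executor lost")  # e.g. killed by YARN
        f.write(data[len(data) // 2 :])

out_dir = tempfile.mkdtemp()
part = os.path.join(out_dir, "part-01344.json")

try:
    direct_task_attempt(part, '{"a": 1}', fail=True)   # attempt 0 dies mid-write
except RuntimeError:
    pass

try:
    direct_task_attempt(part, '{"a": 1}')              # attempt 1 (the retry)
except FileExistsError:
    print("retry blocked: partial output already at final path")
```

The retry never gets a chance to report the real failure, which is exactly why the driver shows the FileAlreadyExistsException instead of the OOM.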

Instead, I used the Netflix S3 committer, which commits in a multipart fashion. Each task first writes its files to local disk. During task commit, each file is uploaded to S3 as a multipart upload but is not yet made visible. During job commit (which runs only once all tasks have completed successfully, so it is a safe point), the pending uploads are completed, the data becomes visible on S3, and the local-disk copies are deleted. Because failed task attempts never write directly to S3, retries no longer hit the FileAlreadyExistsException.
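The lifecycle can be sketched like this, with a dict standing in for the S3 bucket and local temp files for the executor's disk (all names are illustrative, not the committer's real API):

```python
import os
import tempfile

s3 = {}                 # stands in for the S3 bucket
pending_uploads = []    # multipart uploads started but not yet completed

def task_write_and_commit(key, data, fail=False):
    # 1. The task writes to local disk; nothing touches "S3" yet.
    fd, local = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write(data)
    if fail:
        os.remove(local)            # a failed attempt leaves S3 untouched
        raise RuntimeError("executor lost")
    # 2. Task commit: start a multipart upload; still not visible.
    pending_uploads.append((key, local))

def job_commit():
    # 3. Runs only after every task succeeded: complete the uploads,
    #    making the files visible, then clean up the local disk.
    for key, local in pending_uploads:
        with open(local) as f:
            s3[key] = f.read()
        os.remove(local)
    pending_uploads.clear()

try:
    task_write_and_commit("part-01344.json", '{"a": 1}', fail=True)  # attempt 0 dies
except RuntimeError:
    pass
task_write_and_commit("part-01344.json", '{"a": 1}')                 # retry: no collision
job_commit()
print(sorted(s3))  # data appears only after job commit
```

The key property is that the final location is only ever touched at job commit, so a retried task can never collide with a dead attempt's output.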

Now for the executor OOMs: they still happen for my query, but the retries now succeed, whereas with the DirectFileOutputCommitter the retries were failing too.

To solve this, I basically did

set spark.sql.sources.outputCommitterClass=com.netflix.s3.S3DirectoryOutputCommitter;
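For reference, the same setting can also be applied outside the SQL shell, e.g. at submit time (illustrative only; the committer class must come from a JAR actually deployed on your cluster):

```
spark-submit \
  --conf spark.sql.sources.outputCommitterClass=com.netflix.s3.S3DirectoryOutputCommitter \
  ...
```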
sbrk