
I have a simple Spark program running on an EMR cluster that tries to convert a 60 GB CSV file into Parquet. When I submit the job I get the exception below.

391, ip-172-31-36-116.us-west-2.compute.internal, executor 96): org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:285)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:197)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:196)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: D13A3F4D7DD970FA; S3 Extended Request ID: gj3cPalkkOwtaf9XN/P+sb3jX0CNHu/QF9WTabkgP2ISuXcXdbvYO1Irg0O54OCvKlLz8WoR8E4=), S3 Extended Request ID: gj3cPalkkOwtaf9XN/P+sb3jX0CNHu/QF9WTabkgP2ISuXcXdbvYO1Irg0O54OCvKlLz8WoR8E4=
at com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
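
For reference, the job itself is essentially just a read of the CSV and a write of the Parquet. A minimal sketch of it (the bucket paths and options below are placeholders, not the actual ones):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder.appName("CsvToParquet").getOrCreate()

    // read the ~60 GB CSV input from S3 (placeholder path)
    val df = spark.read.option("header", "true").csv("s3://input-bucket/data/")

    // write it back out as Parquet (placeholder path)
    df.write.mode("overwrite").parquet("s3://output-bucket/data-parquet/")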

2 Answers


503 Slow Down is a generic response from AWS services when you're making too many requests per second.

Possible solutions:

  1. Copy your file to HDFS first.
  2. Do you have one 60 GB file or a lot of files that sum up to 60 GB? If you have a lot of small files, try to combine them first.
  3. Try to decrease the number of partitions in your Parquet output, if you can, e.g. df.repartition(100) (see the sketch after this list).
  4. Try using fewer Spark workers, e.g. val spark = SparkSession.builder.appName("Simple Application").master("local[1]").getOrCreate()
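
A minimal sketch of point 3 in context (assuming df is the DataFrame read from the CSV; the partition count of 100 and the output path are placeholder values to tune for your data):

    // fewer, larger output partitions mean fewer S3 write requests per second
    df.repartition(100)
      .write
      .mode("overwrite")
      .parquet("s3://output-bucket/data-parquet/")   // placeholder output path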
Sergey Kovalev

I'm surprised that things failed; the Apache s3a client backs off when it sees a problem like this: your work is done, just more slowly.

All of Sergey's advice is good. I'd start by coalescing small files and reducing workers: a smaller cluster can deliver more performance, and save money.

One more: if you are using SSE-KMS to encrypt the data, accessing that key can trigger throttle events too; that throttling is shared across all applications trying to use the KMS store.
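
If the throttling persists, the client's retry/backoff behaviour can also be tuned. Roughly like this for the s3a connector (option names as documented for Hadoop's s3a client; on EMR with EMRFS the analogous setting is fs.s3.maxRetries in the emrfs-site classification):

    import org.apache.spark.sql.SparkSession

    // pass Hadoop/s3a options through Spark with the spark.hadoop. prefix
    val spark = SparkSession.builder
      .appName("CsvToParquet")
      .config("spark.hadoop.fs.s3a.attempts.maximum", "20")  // more retries on transient errors such as 503
      .config("spark.hadoop.fs.s3a.retry.interval", "1s")    // wait longer between retry attempts
      .getOrCreate()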

stevel
  • Thanks for your response. Yes, I am using KMS for encryption, but it is KMS/S3. As a workaround I have written the data to HDFS and then tried s3-dist-cp, but I still get the same error during s3-dist-cp. – kalyan chakravarthy May 15 '18 at 21:10
  • Second thing: as I am using an EMR cluster, I think it is not using the Apache s3a filesystem; it should be using the EMRFS library. – kalyan chakravarthy May 15 '18 at 21:33