
I am running Spark code on an EC2 instance and keep hitting a "Too many open files" error (log below). From searching online, it seems I need to raise the ulimit. Since I am running the Spark job on AWS and don't know where the relevant config file is, how can I pass that value from my Spark code?

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 255 in stage 19.1 failed 4 times, most recent failure: Lost task 255.3 in stage 19.1 (TID 749786, 172.31.20.34, executor 207): java.io.FileNotFoundException: /media/ebs0/yarn/local/usercache/data-platform/appcache/application_1559339304634_2088/blockmgr-90a63e4a-dace-4246-a158-270b0b34c1f9/20/broadcast_13 (Too many open files)
daydayup
  • possible duplicate of [this](https://stackoverflow.com/questions/25707629/why-does-spark-job-fail-with-too-many-open-files) – Ram Ghadiyaram Jul 10 '19 at 20:36
  • The ulimit is a property of the system and user. https://unix.stackexchange.com/questions/8945/how-can-i-increase-open-files-limit-for-all-processes should show you how to change it. – tk421 Jul 11 '19 at 20:22
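
As the comments point out, the open-files limit is an operating-system setting for the user that runs the executor processes (on YARN, typically the yarn/hadoop user), so it cannot be raised from inside the Spark job itself; on an EC2/EMR cluster it is normally raised in /etc/security/limits.conf (or via a bootstrap action) and picked up after the NodeManagers restart. What you *can* do from Spark code is verify the limit the executors actually see. A minimal sketch, assuming Linux executors (the object name and partition count are arbitrary):

```scala
import java.net.InetAddress
import scala.io.Source
import org.apache.spark.sql.SparkSession

object CheckOpenFileLimit {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("check-open-file-limit").getOrCreate()
    val sc = spark.sparkContext

    // Run enough tasks that every executor gets at least one; each task reads
    // the effective "Max open files" limit of its executor JVM from
    // /proc/self/limits (Linux only).
    val limits = sc.parallelize(1 to 200, 200).map { _ =>
      val host = InetAddress.getLocalHost.getHostName
      val src = Source.fromFile("/proc/self/limits")
      try {
        val line = src.getLines().find(_.startsWith("Max open files")).getOrElse("limit not found")
        (host, line)
      } finally {
        src.close() // do not leak the handle we just opened
      }
    }.distinct().collect()

    limits.foreach { case (host, limit) => println(s"$host: $limit") }
    spark.stop()
  }
}
```

If the executors already report a high limit, the error is more likely caused by descriptor leaks, which the answer below addresses.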

1 Answer


Apart from raising the ulimit, you should also look for connection leaks. For example, check whether your I/O connections (streams, sockets, file handles) are properly closed. We saw the "Too many open files" exception even with a 655k ulimit on every node; it later turned out to be connection leaks in the code.
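
A sketch of what such a leak typically looks like in a Spark job, and the fix, assuming reads through the Hadoop FileSystem API (the object and method names and the per-record file reads are illustrative, not taken from the original code):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.rdd.RDD

object OpenFileHygiene {
  // Leaky: the stream is never closed, so every record pins a file descriptor
  // on the executor until the JVM happens to reclaim it -- under load this is
  // exactly what produces "Too many open files".
  def bytesPerFileLeaky(paths: RDD[String]): RDD[Long] =
    paths.map { p =>
      val fs = FileSystem.get(new Configuration())
      val in = fs.open(new Path(p))
      Iterator.continually(in.read()).takeWhile(_ != -1).size.toLong
    }

  // Fixed: close the stream deterministically in a finally block, so the
  // descriptor is released as soon as the record has been processed.
  def bytesPerFileClosed(paths: RDD[String]): RDD[Long] =
    paths.map { p =>
      val fs = FileSystem.get(new Configuration())
      val in = fs.open(new Path(p))
      try Iterator.continually(in.read()).takeWhile(_ != -1).size.toLong
      finally in.close()
    }
}
```

The same pattern applies to sockets, JDBC connections, and anything else opened inside map or foreachPartition closures.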

voldy