AWS Glue executor memory limit

Question

I found that AWS Glue sets up each executor instance with a 5 GB memory limit (`--conf spark.executor.memory=5g`), and sometimes, on big datasets, the job fails with `java.lang.OutOfMemoryError`. The same applies to the driver instance (`--conf spark.driver.memory=5g`). Is there any option to increase this value?

I tried to run the Glue job with the parameters `--driver-memory 8g` and `--executor-memory 8g` but saw no change. The job still fails with `java.lang.OutOfMemoryError` when trying to load more than 5 GB of data. – Alexey Bakulin, Mar 05 '18 at 07:37
Have you confirmed in the log whether your changes were picked up? Something like `--conf spark.executor.memory=8g`. – Ajith Kumara, Mar 15 '18 at 08:54
Yes, in the logs I see that the parameter `--executor-memory 8g` was passed in the run parameters. But since I can only pass _script_ parameters, I see two `--executor-memory` entries: the first is part of the Spark job run parameters passed by Glue, and the second is mine. Like this: `/usr/lib/spark/bin/spark-submit --master yarn --executor-memory 5g ... /tmp/runscript.py script_2018-03-16-11-09-28.py --JOB_NAME XXX --executor-memory 8g` After that comes a log message like `18/03/16 11:09:31 INFO Client: Will allocate AM container, with 5632 MB memory including 512 MB overhead` – Alexey Bakulin, Mar 16 '18 at 11:20
@TofigHasanov Still not. Please try the solution from Kris Bravo https://stackoverflow.com/questions/49034126/aws-glue-executor-memory-limit/50122948#50122948 and let me know. Right now I am unable to test it. Hope it works. – Alexey Bakulin, May 02 '18 at 19:08
I tried the following setting, with key `--conf` and value `spark.driver.extraClassPath=s3://temp/jsch-0.1.55.jar`, to give precedence to the latest jsch jar over the version that Glue selects, but it doesn't work. Am I missing something? Also, as @rileyss mentioned, the Glue documentation states that conf cannot be set. So how should we go about resolving this? – Dwarrior, Mar 16 '19 at 04:32
The official docs at https://docs.aws.amazon.com/glue/latest/dg/monitor-profile-debug-oom-abnormalities.html cover this exact situation. – Benny, Jul 23 '19 at 23:31

score 16 · Answer 1 · answered May 16 '19 at 20:51

Despite the AWS documentation stating that the `--conf` parameter should not be passed, our AWS support team told us to pass `--conf spark.driver.memory=10g`, which corrected the issue we were having.

edited Aug 28 '19 at 15:39

answered May 16 '19 at 20:51

xtreampb

score 11 · Answer 2 · answered May 01 '18 at 19:54

You can override the parameters by editing the job and adding job parameters. The key and value I used are here:

Key: --conf

Value: spark.yarn.executor.memoryOverhead=7g

This seemed counterintuitive since the setting key is actually in the value, but it was recognized. So if you're attempting to set spark.yarn.executor.memory the following parameter would be appropriate:

Key: --conf

Value: spark.yarn.executor.memory=7g
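
For anyone setting this outside the console, here is a minimal sketch of how that key/value pair maps onto a job-arguments dictionary, such as the `DefaultArguments` block of a CloudFormation template. The helper name `conf_job_parameter` is made up for illustration. Whether Glue honors the setting is not guaranteed, and the comments below report mixed results. Also note that the question's own log shows the executor heap property as `spark.executor.memory`.

```python
import json


def conf_job_parameter(spark_property: str, value: str) -> dict:
    """Return the Glue job parameter (key/value pair) for one Spark property.

    Hypothetical helper for illustration only, not part of any AWS SDK.
    """
    # The key is always "--conf"; the Spark property name goes into the value.
    return {"--conf": f"{spark_property}={value}"}


# Same key/value pair as in the answer above.
default_arguments = conf_job_parameter("spark.yarn.executor.memoryOverhead", "7g")

# Print it in the shape a "DefaultArguments" block would take.
print(json.dumps({"DefaultArguments": default_arguments}, indent=2))
```

The same dictionary could be passed as `Arguments` when starting a run through boto3's `start_job_run`, assuming you manage jobs that way.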

answered May 01 '18 at 19:54

Kris Bravo

Thanks Kris. I will test your solution as soon as I can. – Alexey Bakulin May 02 '18 at 19:10

I just added the following in my job section on my CloudFormation template, in the `DefaultArguments` part: `"--conf": "spark.yarn.executor.memory=8g"` without luck. The job fails with the message `Container killed by YARN for exceeding memory limits. 5.7 GB of 5.5 GB physical memory used.` I can actually see the parameter in the Job Parameters. – Xavi Jul 05 '18 at 07:22

@Xavi It could very well be the driver's config you need to modify. E.g `"spark.driver.memory=8g"` – selle May 04 '20 at 18:43

score 10 · Answer 3 · answered Nov 29 '18 at 12:48

Open Glue > Jobs > Edit your job > Script libraries and job parameters (optional) > Job parameters (near the bottom), then set the following:

Key: --conf

Value: spark.yarn.executor.memoryOverhead=1024 spark.driver.memory=10g

answered Nov 29 '18 at 12:48

ashutosh singh

score 4 · Accepted Answer · answered Jul 24 '18 at 20:51

The official Glue documentation suggests that Glue doesn't support custom Spark configuration. It says:

There are also several argument names used by AWS Glue internally that you should never set:

--conf — Internal to AWS Glue. Do not set!

--debug — Internal to AWS Glue. Do not set!

--mode — Internal to AWS Glue. Do not set!

--JOB_NAME — Internal to AWS Glue. Do not set!

Any better suggestion on solving this problem?

answered Jul 24 '18 at 20:51

cozyss

Have you been able to figure out the resolution for this? I tried following setting with key as `--conf` and value as `spark.driver.extraClassPath=s3://temp/jsch-0.1.55.jar` for giving precedence to latest jar of jsch instead of the version that Glue is selecting but it doesn't work. Am I missing something? So, how should we go about resolving this? – Dwarrior Mar 16 '19 at 04:32

@Dwarrior I'm not sure if you can customize anything about Spark on Glue. It seems that Glue runs in a pre-set environment, and that's why it's cheap. My solution is dividing the input data into smaller chunks and running several Glue jobs. If you really need customized Spark settings, you can try AWS EMR, which gives you much more freedom in adjusting Spark parameters. – cozyss Mar 21 '19 at 17:08
Thanks! Will explore the other options. I gathered from other answers that some settings did work. :) – Dwarrior Mar 22 '19 at 13:47
Pay attention! The answer from cozyss is not correct! You can set custom Spark parameters for Glue jobs as described here: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-glue-arguments.html You can, for example, set `spark.sql.autoBroadcastJoinThreshold` as follows: Job parameters key: `--conf` Job parameters value: `spark.sql.autoBroadcastJoinThreshold=-1` – sromano Jun 20 '23 at 10:47

score 1 · Answer 5 · answered Feb 11 '19 at 16:11

I hit out-of-memory errors like this when I had a highly skewed dataset. In my case, I had a bucket of JSON files containing dynamic payloads that differed based on the event type indicated in the JSON. I kept hitting out-of-memory errors whether or not I used the configuration flags indicated here and increased the DPUs. It turned out that my events were highly skewed, with a couple of the event types making up more than 90% of the total data set. Once I added a "salt" to the event types and broke up the highly skewed data, I did not hit any more out-of-memory errors.

Here's a blog post for AWS EMR that talks about the same Out of Memory error with highly skewed data. https://medium.com/thron-tech/optimising-spark-rdd-pipelines-679b41362a8a
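
To illustrate the salting idea, here is a small plain-Python sketch (no Spark required). It is not the code from my job. The event types, partition count, salt count, and the CRC-based stand-in for a hash partitioner are all made up for the example. It shows how appending a random salt spreads one hot key over several partitions, and how a two-stage aggregation recovers the per-key totals afterwards.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8   # hypothetical partition count
NUM_SALTS = 16       # hypothetical number of salt values per key


def partition_of(key: str) -> int:
    """Deterministic stand-in for a hash partitioner."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def salted(key: str, rng: random.Random) -> str:
    """Append a random salt so one hot key becomes up to NUM_SALTS distinct keys."""
    return f"{key}#{rng.randrange(NUM_SALTS)}"


rng = random.Random(42)

# Made-up skewed data: 90% of the events share a single event type.
event_types = ["click"] * 9000 + ["purchase"] * 500 + ["signup"] * 500

# Partition sizes when grouping by the raw key versus the salted key.
plain_sizes = Counter(partition_of(k) for k in event_types)
salted_sizes = Counter(partition_of(salted(k, rng)) for k in event_types)

print("largest partition without salt:", max(plain_sizes.values()))
print("largest partition with salt:   ", max(salted_sizes.values()))

# Two-stage aggregation: count per salted key, then merge back to the original key.
partial = Counter(salted(k, rng) for k in event_types)
totals = Counter()
for salted_key, count in partial.items():
    original_key = salted_key.split("#", 1)[0]
    totals[original_key] += count

print(dict(totals))
```

In a Spark job the same pattern would mean grouping by the salted column first and then aggregating the partial results by the original column.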

score 0 · Answer 6 · answered Jan 12 '22 at 17:22

You can use the Glue G.1X and G.2X worker types, which give more memory and disk space, to scale Glue jobs that need high memory and throughput. You can also edit the Glue job and set the `--conf` value to `spark.yarn.executor.memoryOverhead=1024` (or `2048`) and `spark.driver.memory=10g`.
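
If you start runs programmatically, a rough sketch of a request that selects a larger worker type could look like the following. The job name and worker count are placeholders. `WorkerType`, `NumberOfWorkers`, and `Arguments` are parameters of boto3's Glue `start_job_run`, and the actual call is left commented out so the snippet runs without AWS credentials.

```python
def job_run_request(job_name: str, worker_type: str, number_of_workers: int) -> dict:
    """Build keyword arguments for a Glue job run (hypothetical helper)."""
    return {
        "JobName": job_name,
        "WorkerType": worker_type,            # e.g. "G.1X" or "G.2X"
        "NumberOfWorkers": number_of_workers,
        # Optional Spark override from this answer; may or may not be honored.
        "Arguments": {"--conf": "spark.driver.memory=10g"},
    }


# "my-etl-job" and the worker count of 10 are placeholders.
request = job_run_request("my-etl-job", "G.2X", 10)
print(request)

# With AWS credentials configured, the request could be sent like this:
# import boto3
# boto3.client("glue").start_job_run(**request)
```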

edited Jan 12 '22 at 21:19

glennsl

answered Jan 12 '22 at 17:22

Tanmoy Dasgupta
