13

Multiple answers on Stack Overflow for AWS Glue say to set the --conf table parameter. However, a job sometimes needs several --conf key-value pairs set at once.

I've tried the following ways of setting multiple --conf values, all of which resulted in an error:

  • Add another table parameter called --conf. The AWS dashboard removes the second parameter named --conf and moves focus to the value of the first one. Terraform likewise treats both table parameters with the key --conf as the same parameter and overwrites the first value with the second.
  • Separate the config key-value pairs with a space in the value of the --conf table parameter, e.g. spark.yarn.executor.memoryOverhead=1024 spark.yarn.executor.memoryOverhead=7g spark.yarn.executor.memory=7g. The job fails to start.
  • Separate the config key-value pairs with a comma in the value of the --conf table parameter, e.g. spark.yarn.executor.memoryOverhead=1024, spark.yarn.executor.memoryOverhead=7g, spark.yarn.executor.memory=7g. The job fails to start.
  • Separate the key-value pairs with the string --conf inside the value of the --conf parameter, e.g. spark.yarn.executor.memoryOverhead=1024 --conf spark.yarn.executor.memoryOverhead=7g --conf spark.yarn.executor.memory=7g. The Glue job hangs.

How do I set multiple --conf table parameters in AWS Glue?

Zambonilli

2 Answers

25

You can pass multiple parameters as below:

Key: --conf

Value: spark.yarn.executor.memoryOverhead=7g --conf spark.yarn.executor.memory=7g

This has worked for me.
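
If the job parameters are generated from a script rather than typed into the console, the same value string can be built programmatically. The snippet below is only a minimal sketch of that idea: the helper name build_conf_value and the spark_settings dictionary are hypothetical, and it simply joins the pairs with " --conf " as in the value above.

```python
def build_conf_value(settings):
    """Join Spark settings into one string for a single --conf job parameter.

    Hypothetical helper: every pair after the first is prefixed with
    ' --conf ', matching the value format shown in this answer.
    """
    pairs = [f"{key}={value}" for key, value in settings.items()]
    return " --conf ".join(pairs)


# Example settings taken from the answer above.
spark_settings = {
    "spark.yarn.executor.memoryOverhead": "7g",
    "spark.yarn.executor.memory": "7g",
}

# A single --conf key whose value carries all the settings.
job_arguments = {"--conf": build_conf_value(spark_settings)}
print(job_arguments)
```

A dictionary of this shape could then be supplied wherever the job's arguments are defined, for example as the Arguments of a boto3 Glue start_job_run call, though I have only verified the console approach.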

Nihir
-2

You can override the parameters by editing the job and adding job parameters. The key and value I used were:

Key: --conf

Value: spark.yarn.executor.memoryOverhead=7g

This seemed counterintuitive, since the setting's key actually goes in the value, but it was recognized. So if you're trying to set spark.yarn.executor.memory, the following parameter would be appropriate:

Key: --conf

Value: spark.yarn.executor.memory=7g

For more information, see the answer this one was adapted from: https://stackoverflow.com/a/50122948/10968161

Community