
I want to tune my Spark cluster on AWS EMR, but I can't change the default value of spark.driver.memory, which causes every Spark application to crash because my dataset is big.

I tried editing the spark-defaults.conf file manually on the master node, and I also tried configuring it directly with a JSON file on the EMR dashboard while creating the cluster.

Here's the JSON file used:

[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.driver.memory": "7g",
      "spark.driver.cores": "5",
      "spark.executor.memory": "7g",
      "spark.executor.cores": "5",
      "spark.executor.instances": "11"
    }
  }
]
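
For reference, the same classifications can also be supplied programmatically when the cluster is created; below is a minimal boto3 sketch (the region, cluster name, EMR release, instance types, and IAM roles are placeholder assumptions, adjust them to your setup):

import boto3

# Placeholder region -- use your own.
emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="tuned-spark-cluster",          # placeholder cluster name
    ReleaseLabel="emr-5.23.0",           # assumed EMR release
    Applications=[{"Name": "Spark"}],
    Configurations=[
        {
            "Classification": "spark-defaults",
            "Properties": {
                "spark.driver.memory": "7g",
                "spark.driver.cores": "5",
                "spark.executor.memory": "7g",
                "spark.executor.cores": "5",
                "spark.executor.instances": "11",
            },
        }
    ],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 3},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])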

After using the JSON file, the configurations show up correctly in spark-defaults.conf, but the Spark dashboard always shows the default value of 1000M for spark.driver.memory, while the other values are applied correctly (a quick way to check the effective value from inside the application is sketched below). Has anyone run into the same problem? Thank you in advance.
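
A quick way to see which value the running application actually resolved, rather than what spark-defaults.conf contains, is to read it back from the SparkConf; a minimal PySpark sketch:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Print the driver memory the application is actually running with;
# if this still shows 1000M/1g, the value from spark-defaults.conf was
# overridden or ignored when the driver JVM was launched.
print(spark.sparkContext.getConf().get("spark.driver.memory", "not set"))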

asr9
yassidhbi

1 Answer


You need to set

maximizeResourceAllocation=true

in the spark classification of your EMR configuration:

[
  {
    "Classification": "spark",
    "Properties": {
      "maximizeResourceAllocation": "true"
    }
  }
]
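
If the goal is only to raise the driver memory for a single application, another option (not part of the maximizeResourceAllocation approach above, and assuming the script launches its own driver JVM, i.e. it is started with plain python rather than spark-submit) is a sketch like:

from pyspark.sql import SparkSession

# spark.driver.memory only takes effect if it is set before the driver JVM
# starts. When the session builder launches the JVM (plain `python app.py`),
# this works; with spark-submit, pass --driver-memory 7g on the command line
# instead.
spark = (
    SparkSession.builder
    .appName("driver-memory-override")   # placeholder app name
    .config("spark.driver.memory", "7g")
    .getOrCreate()
)

print(spark.sparkContext.getConf().get("spark.driver.memory"))

With spark-submit, the equivalent is the --driver-memory flag, since spark.driver.memory cannot be changed once the driver JVM is already running.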
Vishnu
  • I actually used that before trying manual values, but the spark.driver.memory value didn't change at all. – yassidhbi Apr 12 '19 at 08:18