
In Spark 2.0, how do you set spark.yarn.executor.memoryOverhead when you run spark-submit?

I know that for things like spark.executor.cores you can pass --executor-cores 2. Is it the same pattern for this property, e.g. --yarn-executor-memoryOverhead 4096?

Micah Pearce

2 Answers


Please find an example below. The values can also be set programmatically through SparkConf; a sketch of that follows the spark-submit example.

Example:

# --executor-memory: amount of memory to use per executor process
# --driver-memory:   amount of memory to use for the driver process
# --driver-cores:    number of cores to use for the driver process
./bin/spark-submit \
  --class [your class] \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 17 \
  --conf spark.yarn.executor.memoryOverhead=4096 \
  --executor-memory 35G \
  --conf spark.yarn.driver.memoryOverhead=4096 \
  --driver-memory 35G \
  --executor-cores 5 \
  --driver-cores 5 \
  --conf spark.default.parallelism=170 \
  /path/to/examples.jar
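
For reference, below is a minimal sketch of giving the same values through SparkConf from PySpark instead of spark-submit flags. The application name is an illustrative placeholder, and driver-side settings generally only take effect this way if they are applied before the driver JVM starts.

from pyspark import SparkConf, SparkContext

conf = (
    SparkConf()
        .setMaster('yarn')
        .setAppName('MemoryOverheadExample')  # placeholder name
        .set('spark.yarn.executor.memoryOverhead', '4096')
        .set('spark.yarn.driver.memoryOverhead', '4096')
        .set('spark.executor.memory', '35g')
        .set('spark.driver.memory', '35g')    # only honoured if set before the driver JVM starts
        .set('spark.executor.cores', '5')
        .set('spark.default.parallelism', '170')
)
sc = SparkContext(conf=conf)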
Rahul

spark.yarn.executor.memoryOverhead has now been deprecated:

WARN spark.SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.


You can programmatically set spark.executor.memoryOverhead by passing it as a config:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
        .master('yarn')
        .appName('StackOverflow')
        .config('spark.driver.memory', '35g')
        .config('spark.executor.cores', 5)
        .config('spark.executor.memory', '35g')
        .config('spark.dynamicAllocation.enabled', True)
        .config('spark.dynamicAllocation.maxExecutors', 25)
        .config('spark.executor.memoryOverhead', '4096')  # new key, replaces spark.yarn.executor.memoryOverhead
        .getOrCreate()
)
sc = spark.sparkContext
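
As a quick sanity check (a sketch, assuming the session built above), you can read the effective value back from the running session's configuration:

# Read the setting back from the active SparkConf; pass a default so
# get() does not raise if the key was never set.
overhead = spark.sparkContext.getConf().get('spark.executor.memoryOverhead', 'not set')
print(overhead)  # '4096' for the session configured above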
Giorgos Myrianthous
  • I am using Spark 2.4.5; if it is deprecated, then why am I getting org.apache.spark.SparkException: Job aborted due to stage failure: Task 51 in stage 1218.0 failed 1 times, most recent failure: Lost task 51.0 in stage 1218.0 (TID 62209, dev1-zz-1a-10x24x96x95.dev1.grid.spgmidev.com, executor 13): ExecutorLostFailure (executor 13 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. Consider boosting spark.yarn.executor.memoryOverhead or disabling yarn.nodemanager.vmem-check-enabled because of YARN-4714. – Shasu Jul 27 '21 at 12:42