When running a Python job in AWS Glue, I get the error:
Reason: Container killed by YARN for exceeding memory limits. 5.6 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead
When I run the following at the beginning of the script:
print '--- Before Conf --'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")
print '--- Conf --'
sc._conf.setAll([
    ('spark.yarn.executor.memory', '15G'),
    ('spark.yarn.executor.memoryOverhead', '10G'),
    ('spark.yarn.driver.cores', '5'),
    ('spark.yarn.executor.cores', '5'),
    ('spark.yarn.cores.max', '5'),
    ('spark.yarn.driver.memory', '15G')
])
print '--- After Conf ---'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print 'spark.yarn.executor.memoryOverhead', sc._conf.get('spark.yarn.executor.memoryOverhead')
I get the following output:
--- Before Conf --
spark.yarn.driver.memory None
spark.yarn.driver.cores None
spark.yarn.executor.memory None
spark.yarn.executor.cores None
spark.yarn.executor.memoryOverhead None
--- Conf --
--- After Conf ---
spark.yarn.driver.memory 15G
spark.yarn.driver.cores 5
spark.yarn.executor.memory 15G
spark.yarn.executor.cores 5
spark.yarn.executor.memoryOverhead 10G
It looks like spark.yarn.executor.memoryOverhead is being set, so why is it not picked up? I still get the same error.
I have seen other posts about problems setting spark.yarn.executor.memoryOverhead, but none where it appears to be set and still has no effect.
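For what it is worth, my assumption is that these properties normally have to be in place before the SparkContext is created, so setting them on the already-running context may simply be too late. Below is a rough sketch of what I imagine that would look like; the property names and the idea of stopping and recreating the context inside a Glue job are my own guesses, not something I know Glue supports:

from pyspark import SparkConf, SparkContext

# Stop the context that Glue has already created and rebuild it with the
# memory settings applied up front (guess: this may not be allowed in Glue).
sc.stop()

conf = SparkConf() \
    .set('spark.yarn.executor.memoryOverhead', '10G') \
    .set('spark.executor.memory', '15G') \
    .set('spark.executor.cores', '5') \
    .set('spark.driver.memory', '15G') \
    .set('spark.driver.cores', '5')

sc = SparkContext(conf=conf)
print 'spark.yarn.executor.memoryOverhead', sc._conf.get('spark.yarn.executor.memoryOverhead')

Is something like this the right approach, or is there a Glue-specific way to pass these settings to the job?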