
I am having an issue where my Spark executors are running out of PermGen space. I launch my Spark jobs using Oozie's Spark action. The same jobs run without issue when I launch them manually with spark-submit. I have tried increasing the PermGen space using each of these methods:

    1. Passing `--conf spark.executor.extraJavaOptions=-XX:MaxPermSize=2g`
       as a command-line option in `<spark-opts>` in Oozie's workflow.xml
    2. Increasing `oozie.launcher.mapreduce.child.java.opts` in the global `<configuration>` in Oozie's workflow.xml
    3. Programmatically in the Spark code, when the SparkConf is created, using `sparkConf.set("spark.executor.extraJavaOptions", "-XX:MaxPermSize=2g")`

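For reference, a minimal sketch of where options 1 and 2 sit in workflow.xml (element names follow the Oozie Spark action schema; the workflow name and action name are placeholders, and unrelated elements are elided):

    <workflow-app name="permgen-demo" xmlns="uri:oozie:workflow:0.5">
        <global>
            <configuration>
                <property>
                    <!-- option 2: launcher JVM opts in the global configuration -->
                    <name>oozie.launcher.mapreduce.child.java.opts</name>
                    <value>-XX:MaxPermSize=2g</value>
                </property>
            </configuration>
        </global>
        <action name="spark-job">
            <spark xmlns="uri:oozie:spark-action:0.1">
                ...
                <!-- option 1: pass the executor JVM opts through spark-opts -->
                <spark-opts>--conf spark.executor.extraJavaOptions=-XX:MaxPermSize=2g</spark-opts>
                ...
            </spark>
            ...
        </action>
    </workflow-app>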
The Spark UI's executor page indicates that the property is set successfully. But when the executors are launched, their command line still contains the default of 256M in addition to the passed parameter, which appears in single quotes. Since `-XX:MaxPermSize=256m` appears later on the command line, the JVM honors it and the 2g setting is effectively ignored:

    {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms6144m -Xmx6144m '-XX:MaxPermSize=2g' -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=45057' -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.0.0.124:45057 --executor-id 3 --hostname ip-10-0-0-126.us-west-2.compute.internal --cores 4 --app-id application_1471652816867_0103 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr

Any ideas on how to increase the PermGen space? I am running Spark 1.6.2 with Hadoop 2.7.2. Oozie version is 4.2.

bdguy
  • Oozie `` properties don't apply to all Actions -- see the first part of that answer for more details: http://stackoverflow.com/questions/38337362/oozie-properties-defined-in-file-referenced-in-global-job-xml-not-visible-in-wo/38338713#38338713 – Samson Scharfrichter Aug 22 '16 at 20:38
  • So you have two more options: modify property `spark.executor.extraJavaOptions` in the **config file** `spark-defaults.conf`; or set Oozie property `mapreduce.map.java.opts` directly in the Spark Action properties (yes, without `oozie.launcher.`, because that prefix applies only to the "launcher" job that runs the Spark driver; the base Hadoop props are passed to the child YARN jobs, i.e. the Spark executors) – Samson Scharfrichter Aug 22 '16 at 20:45
  • BTW, I never could figure out what the `child.java.opts` property is supposed to do. – Samson Scharfrichter Aug 22 '16 at 20:48
  • Side note: I really hope you run Oozie 4.2 and not 2.4 0:-) – Samson Scharfrichter Aug 22 '16 at 20:49
  • Thanks Samson for the comments and the correction to the Oozie version :) The properties do get passed from conf, as shown in the java command of the executor launch, but the default of 256M doesn't get overwritten... the extraJavaOptions are just appended :( – bdguy Aug 22 '16 at 22:14
  • Did you search all your `*-site.xml` config files, both client-side and YARN-server-side, to find out where that damn `-XX:MaxPermSize=256m` is set? With any luck it will be client-side, and you can change it on all nodes (or ship a custom copy of the XML files via Oozie `` commands, and force the Spark driver to use these with env. variable `HADOOP_CONF_DIR`) – Samson Scharfrichter Aug 23 '16 at 09:11
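Following up on that last suggestion, a quick sketch for hunting down the stray flag. The `/etc/hadoop/conf` and `/etc/spark/conf` paths are typical defaults, not confirmed for this cluster; adjust them for your distribution, and repeat the search on the YARN NodeManager hosts since the setting may be server-side:

```shell
# Search the client-side Hadoop and Spark config files for whatever
# is injecting -XX:MaxPermSize=256m into the executor command line.
# grep -r recurses into the directories; -n prints the line number.
grep -rn -- 'MaxPermSize' /etc/hadoop/conf /etc/spark/conf 2>/dev/null
```

Any hit in a `*-site.xml` (e.g. inside `mapreduce.map.java.opts`) or in `spark-defaults.conf` is a candidate for the source of the 256m default.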

0 Answers