
I've defined a default configuration for my Spark application, which lives in src/main/resources/reference.conf. I use ConfigFactory.load() to obtain the configuration.

When I run the application with spark-submit, it picks up these defaults. However, when I provide an application.conf that overrides only a few of the settings from reference.conf, the overrides do not seem to be picked up. From the documentation I understood that application.conf is merged with reference.conf when calling load(), so that it isn't necessary to re-define everything in application.conf.
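For context, the driver reads the configuration roughly like this (a minimal sketch; `hdfs.dataDir` is one of the keys defined in the reference.conf below):

```scala
import com.typesafe.config.ConfigFactory

// load() stacks, in increasing priority: reference.conf from the
// classpath, then application.conf (if found), then system properties.
val config = ConfigFactory.load()

// Keys that application.conf does not override should fall through
// to the defaults in reference.conf.
val dataDir = config.getString("hdfs.dataDir")
```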

My reference.conf looks like this:

```
hdfs {
  rootDir: "/foo"
  dataDir: "hdfs://"${hdfs.rootDir}"/bar"
}

db {
  driver: "com.mysql.jdbc.Driver"
  ...
}

...
```

What I'd now like is an application.conf containing, say, only a custom hdfs section, because the rest is the same.
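Something like this, where `/custom` is just a placeholder value:

```
hdfs {
  rootDir: "/custom"
}
```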

I run my Spark app by supplying application.conf via the --files parameter, via --driver-class-path, and via --conf spark.executor.extraClassPath. This may be overkill, but it works when I create a copy of reference.conf and change a few of the fields.
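Concretely, the submit command looks roughly like this (the class name, JAR, and paths are placeholders):

```
spark-submit \
  --class com.example.MyApp \
  --files /path/to/application.conf \
  --driver-class-path /path/to/application.conf \
  --conf "spark.executor.extraClassPath=/path/to/application.conf" \
  my-app.jar
```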

What am I missing?

Ian
  • What path are you setting for the executor class path? – Yuval Itzchakov Jun 28 '16 at 07:38
  • Isn't that the one I have listed with `--conf`? That one is colon-separated; `--files` is comma-separated. – Ian Jun 28 '16 at 07:41
  • One thing you have to be aware of is that the path for `--driver-class-path` isn't the same as for `spark.executor.extraClassPath`: if you set, for example, `--driver-class-path "/opt/bla/application.conf"`, the equivalent for the executor needs to be only `--conf "spark.executor.extraClassPath=application.conf"`, since `--files` will dump it in the working directory where the executor launches the uber JAR. – Yuval Itzchakov Jun 28 '16 at 07:43
  • So I can leave off the `files` parameter. Got it. Would this be OK then: `spark.executor.extraClassPath=/usr/lib/foo/bar.jar:/foo/bar/application.conf`? – Ian Jun 28 '16 at 08:45
  • No, you *need* the `--files` parameter in order to send `application.conf` to the worker nodes. The path needs to be `spark.executor.extraClassPath=bar.jar:application.conf`. – Yuval Itzchakov Jun 28 '16 at 08:52
  • Understood. I'll change it and try again. – Ian Jun 28 '16 at 09:36

0 Answers