11

I am running a Spark job implemented in Java using spark-submit. I would like to pass parameters to this job - e.g. a time-start and time-end parameter to parametrize the Spark application.

What I tried was using the

--conf key=value

option of the spark-submit script, but when I try to read the parameter in my Spark job with

sparkContext.getConf().get("key")

I get an exception:

Exception in thread "main" java.util.NoSuchElementException: key

Furthermore, when I use sparkContext.getConf().toDebugString() I don't see my value in the output.

Further notice: Since I want to submit my Spark job via the Spark REST service, I cannot use an OS environment variable or the like.

Is there any possibility to implement this?

Michael Lihs
  • Is `key=value` in the example supposed to be a `spark` configuration property or your "custom" property? – VladoDemcak Nov 10 '16 at 19:19
  • I want to have a "custom" property, accessible from within my Spark job (a Java application) – Michael Lihs Nov 10 '16 at 19:20
  • Possible duplicate of [How to pass -D parameter or environment variable to Spark job?](https://stackoverflow.com/questions/28166667/how-to-pass-d-parameter-or-environment-variable-to-spark-job) – Alex K Jun 07 '18 at 08:41
  • Check out this post: https://stackoverflow.com/questions/31115881/how-to-load-java-properties-file-and-use-in-spark – Rahul Sharma Aug 31 '19 at 05:35

3 Answers

12

Since you want to use your own custom properties, you need to place them after application.jar in the spark-submit call (as in the Spark example below, [application-arguments] should be your properties). --conf is meant for Spark configuration properties.

--conf: Arbitrary Spark configuration property in key=value format. For values that contain spaces wrap “key=value” in quotes (as shown).

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # options
  <application-jar> \
  [application-arguments] <-- here go our app arguments

So when you run spark-submit .... app.jar key=value, in the main method you will get args[0] as "key=value".

public static void main(String[] args) {
    String firstArg = args[0]; // equal to "key=value"
}

But if you want to use key-value pairs, you need to parse your application arguments somehow.

You can check the Apache Commons CLI library or some alternative.
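
For illustration, here is a minimal sketch of such parsing without any library, assuming the arguments are passed as key=value pairs after the application JAR (time-start and time-end are just the example names from the question):

import java.util.HashMap;
import java.util.Map;

public class SparkJobArgs {

    public static void main(String[] args) {
        // Collect arguments of the form key=value into a map.
        Map<String, String> params = new HashMap<>();
        for (String arg : args) {
            int idx = arg.indexOf('=');
            if (idx > 0) {
                params.put(arg.substring(0, idx), arg.substring(idx + 1));
            }
        }

        // Read the example parameters from the question.
        String timeStart = params.get("time-start");
        String timeEnd = params.get("time-end");
        System.out.println("time-start=" + timeStart + ", time-end=" + timeEnd);
    }
}

Submitted, for example, as spark-submit ... app.jar time-start=2016-01-01 time-end=2016-01-31.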

VladoDemcak
5

The Spark configuration will only pick up keys in the spark namespace. If you don't want to use an independent configuration tool, you can try:

--conf spark.mynamespace.key=value
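
For completeness, a minimal sketch of reading such a property back in the Java driver; the key spark.mynamespace.key is just the illustrative name from above:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ConfExample {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("ConfExample");
        JavaSparkContext sparkContext = new JavaSparkContext(conf);

        // Properties in the spark.* namespace passed via --conf are visible in the SparkConf.
        String value = sparkContext.getConf().get("spark.mynamespace.key");
        System.out.println("spark.mynamespace.key = " + value);

        sparkContext.stop();
    }
}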
2

You can pass parameters like this:

./bin/spark-submit \
  --class $classname \
  --master XXX \
  --deploy-mode XXX \
  --conf XXX \
  $application-jar --key1 $value --key2 $value2

Make sure to replace key1, key2, value and value2 with proper values.
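
As an illustration, a minimal sketch of reading such --key value pairs in the main method, assuming every flag is followed by exactly one value:

import java.util.HashMap;
import java.util.Map;

public class FlagArgs {

    public static void main(String[] args) {
        // Collect "--key value" pairs passed after the application JAR.
        Map<String, String> params = new HashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].startsWith("--")) {
                params.put(args[i].substring(2), args[i + 1]);
                i++; // skip the value we just consumed
            }
        }

        String key1 = params.get("key1");
        String key2 = params.get("key2");
        System.out.println("key1=" + key1 + ", key2=" + key2);
    }
}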

g00glen00b
renzherl