I am using the Databricks REST API to run Spark jobs. I am using the following commands:

curl -X POST -H "Authorization: XXXX" 'url/api/2.0/jobs/create' -d ' {"name":"jobname","existing_cluster_id":"0725-095337-jello70","libraries": [{"jar": "dbfs:/mnt/pathjar/name-9edeec0f.jar"}],"email_notifications":{},"timeout_seconds":0,"spark_jar_task": {"main_class_name": "com.company.DngApp"}}'

curl -X POST -H "Authorization: XXXX" 'url/api/2.0/jobs/run-now' -d '{"job_id":25854,"jar_params":["--param","value"]}'

Here param is an input argument, but I want to find a way to override Spark driver properties. Usually I do:

--driver-java-options='-Dparam=value'

but I am looking for the equivalent on the Databricks REST API side.
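
For context, a minimal sketch of how this flag is normally passed with spark-submit (the jar path is a placeholder; the class and arguments mirror the job above):

spark-submit \
  --class com.company.DngApp \
  --driver-java-options='-Dparam=value' \
  /path/to/name-9edeec0f.jar --param value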


1 Answer


You cannot use "--driver-java-options" in jar_params.

Reason:

Note: jar_params is a list of parameters for jobs with JAR tasks, e.g. "jar_params": ["john doe", "35"].

The parameters will be used to invoke the main function of the main class specified in the Spark JAR task. If not specified upon run-now, it will default to an empty list. jar_params cannot be specified in conjunction with notebook_params. The JSON representation of this field (i.e. {"jar_params":["john doe","35"]}) cannot exceed 10,000 bytes.
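
For illustration, if you tried to smuggle the flag through jar_params (a hypothetical run-now call, reusing the job ID from the question), the whole string would simply arrive as an ordinary element of the main method's args array; it is never interpreted as a JVM option:

curl -X POST -H "Authorization: XXXX" 'url/api/2.0/jobs/run-now' -d '{"job_id":25854,"jar_params":["--driver-java-options=-Dparam=value"]}'

Inside com.company.DngApp this shows up as args(0), and the system property param is never set.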


For more details, refer to Azure Databricks - Jobs API - Run Now.

You can use spark_conf in the cluster specification to pass user-specified Spark configuration key-value pairs.

An object containing a set of optional, user-specified Spark configuration key-value pairs. You can also pass in a string of extra JVM options to the driver and the executors via spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

Example Spark confs: {"spark.speculation": true, "spark.streaming.ui.retainedBatches": 5} or {"spark.driver.extraJavaOptions": "-verbose:gc -XX:+PrintGCDetails"}

For more details, refer to the "NewCluster" configuration in the Jobs API documentation.
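
For example, a minimal sketch of a jobs/create request that sets the driver option via spark_conf. Since spark_conf belongs to the cluster specification, this sketch uses a new_cluster block instead of the existing_cluster_id from the question; the spark_version, node_type_id, and num_workers values are placeholders:

curl -X POST -H "Authorization: XXXX" 'url/api/2.0/jobs/create' -d '{
  "name": "jobname",
  "new_cluster": {
    "spark_version": "7.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "spark_conf": {
      "spark.driver.extraJavaOptions": "-Dparam=value"
    }
  },
  "libraries": [{"jar": "dbfs:/mnt/pathjar/name-9edeec0f.jar"}],
  "spark_jar_task": {"main_class_name": "com.company.DngApp"}
}'

The -Dparam=value set this way is applied to the driver JVM as a system property, independently of whatever jar_params you pass to run-now.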

Hope this helps.
