
Our simple POST request to Livy for a self-contained PySpark module works fine. However, we have reusable components that are shared by multiple PySpark modules, and all of our code is triggered from the main.py module via a --job argument.
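For context, main.py dispatches to the packaged jobs roughly like this (a minimal sketch; the exact argument parsing and the run(spark) interface of each job are simplified illustrations, not our full code):

import argparse
import importlib

from pyspark.sql import SparkSession

parser = argparse.ArgumentParser()
parser.add_argument("--job", required=True, help="name of the job package to run, e.g. job1")
args = parser.parse_args()

spark = SparkSession.builder.appName(args.job).getOrCreate()

# jobs.zip is distributed via --py-files, so jobs.<name> is importable
job_module = importlib.import_module("jobs." + args.job)
job_module.run(spark)  # assumes each job's __init__.py exposes a run(spark) entry point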

Below is the folder structure:

main.py
jobs.zip
     jobs
          job1
              __init__.py
          job2
              __init__.py

The following spark-submit command works fine. However, we are trying to figure out how to pass the --job argument through the Livy API.

/usr/local/spark/bin/spark-submit \
--py-files jobs.zip \
src/main.py \
--job value1 

1 Answer


Call the REST API's /batches endpoint with the sample JSON below:

{"file":"Path to File containing the application to execute","args":["--job","value1"],"pyFiles":[List of Python files to be used in this session]}

Refer to: https://livy.incubator.apache.org/docs/latest/rest-api.html
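
For example, submitting the batch from Python with the requests library could look like this (a sketch; the Livy host/port and the HDFS paths are placeholders you will need to adjust for your cluster):

import json
import requests

livy_url = "http://livy-host:8998/batches"  # placeholder Livy endpoint

payload = {
    "file": "hdfs:///path/to/src/main.py",    # application to execute
    "pyFiles": ["hdfs:///path/to/jobs.zip"],  # reusable job modules
    "args": ["--job", "value1"],              # forwarded to main.py
}

response = requests.post(
    livy_url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
print(response.status_code, response.json())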

raman