I want to submit batch Spark jobs packaged as JARs through Livy's programmatic API, the same way the REST API's /batches endpoint does. I have the following JSON data:

{
    "className": "org.apache.spark.examples.SparkPi",
    "queue": "default",
    "name": "SparkPi by Livy",
    "proxyUser": "hadoop",
    "executorMemory": "5g",
    "args": [2000],
    "file": "hdfs://host:port/resources/spark-examples_2.11-2.1.1.jar"
}

But I cannot find any documentation about this. Is it possible, and if so, how?

寒江雪

1 Answer

Yes, you can submit Spark jobs through Livy's REST API. Follow these steps:

  • Build the Spark application as an assembly JAR and upload it to the Hadoop cluster's storage (HDFS).
  • Submit the job with curl (for testing), or from your application with an HTTP client API.
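
The second step can also be sketched without any third-party HTTP library. The snippet below is only a minimal illustration, assuming JDK 11+ (for java.net.http) and a Livy server reachable at a placeholder URL such as http://livy-host:8998. The names LivyBatchSubmit, quote, batchPayload and submit are made up for this example and are not part of Livy.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Hypothetical helper object; none of these names come from Livy itself.
object LivyBatchSubmit {

  // Escape backslashes and double quotes, then wrap the value in quotes.
  def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Build the JSON body for POST /batches. The string fields mirror the
  // JSON in the question; the arguments are sent as a list of strings.
  def batchPayload(fields: Seq[(String, String)], args: Seq[String]): String = {
    val stringFields = fields.map { case (k, v) => s"${quote(k)}: ${quote(v)}" }
    val argsField =
      if (args.isEmpty) Seq.empty[String]
      else Seq(quote("args") + ": [" + args.map(quote).mkString(", ") + "]")
    (stringFields ++ argsField).mkString("{", ", ", "}")
  }

  // POST the payload and return (HTTP status code, response body).
  // livyUrl is an assumed placeholder, e.g. "http://livy-host:8998".
  def submit(livyUrl: String, payload: String): (Int, String) = {
    val request = HttpRequest.newBuilder(URI.create(s"$livyUrl/batches"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(payload))
      .build()
    val response = HttpClient.newHttpClient()
      .send(request, HttpResponse.BodyHandlers.ofString())
    (response.statusCode(), response.body())
  }
}
```

For the JSON in the question, the payload could be built as LivyBatchSubmit.batchPayload(Seq("className" -> "org.apache.spark.examples.SparkPi", "queue" -> "default", "proxyUser" -> "hadoop", "executorMemory" -> "5g", "file" -> "hdfs://host:port/resources/spark-examples_2.11-2.1.1.jar"), Seq("2000")) and then passed to submit.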

Sample code to submit a Spark job with an HTTP client in Scala:

import org.apache.http.client.methods.{CloseableHttpResponse, HttpPost}
import org.apache.http.entity.StringEntity
import org.apache.http.impl.client.{CloseableHttpClient, HttpClientBuilder}

import scala.util.parsing.json.{JSONArray, JSONObject}

// clusterConfig and HttpReqUtil are the author's own helpers; they are not defined in this snippet.
def submitJob(className: String, jarPath: String, extraArgs: List[String]): JSONObject = {

  // Livy creates a batch session on POST <livy-server>/batches.
  val jobSubmitRequest = new HttpPost(s"${clusterConfig.livyserver}/batches")

  val baseData: Map[String, Any] = Map(
    "className" -> className,
    "file" -> jarPath,
    "driverMemory" -> "2g",
    "name" -> "LivyTest",
    "proxyUser" -> "hadoop")

  // Map is immutable, so the result of + has to be kept.
  // Wrapping the arguments in JSONArray makes them serialize as a JSON list.
  val data =
    if (extraArgs != null && extraArgs.nonEmpty) baseData + ("args" -> JSONArray(extraArgs))
    else baseData

  val json = new JSONObject(data)

  println(json.toString())

  val params = new StringEntity(json.toString(), "UTF-8")
  params.setContentType("application/json")

  jobSubmitRequest.addHeader("Content-Type", "application/json")
  jobSubmitRequest.addHeader("Accept", "*/*")
  jobSubmitRequest.setEntity(params)

  val client: CloseableHttpClient = HttpClientBuilder.create().build()
  try {
    val response: CloseableHttpResponse = client.execute(jobSubmitRequest)
    // Return the parsed JSON body of Livy's response.
    HttpReqUtil.parseHttpResponse(response)._2
  } finally {
    client.close() // release the connection once the response has been parsed
  }
}
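
Once the batch has been submitted, its progress can be checked through Livy's GET /batches/{batchId}/state endpoint. The following is a rough sketch rather than a definitive implementation. It assumes JDK 11+, a response body shaped like {"id":0,"state":"running"}, and that success, dead, killed and error are the states after which a batch stops changing (worth checking against your Livy version). LivyBatchState and its members are hypothetical names.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Hypothetical helper object; none of these names come from Livy itself.
object LivyBatchState {

  // Assumed set of final states; verify against your Livy version.
  val terminalStates: Set[String] = Set("success", "dead", "killed", "error")

  private val StatePattern = "\"state\"\\s*:\\s*\"([^\"]+)\"".r

  // Extract the "state" value from a body such as {"id":0,"state":"running"}.
  def parseState(body: String): Option[String] =
    StatePattern.findFirstMatchIn(body).map(_.group(1))

  def isTerminal(state: String): Boolean = terminalStates.contains(state)

  // Poll the state endpoint until a terminal state is seen or maxPolls
  // requests have been made; returns the last state that was read.
  def waitForBatch(livyUrl: String, batchId: Int,
                   maxPolls: Int = 60, delayMs: Long = 5000L): Option[String] = {
    val client = HttpClient.newHttpClient()
    val request = HttpRequest
      .newBuilder(URI.create(s"$livyUrl/batches/$batchId/state"))
      .GET()
      .build()
    var last: Option[String] = None
    var polls = 0
    while (polls < maxPolls && !last.exists(isTerminal)) {
      if (polls > 0) Thread.sleep(delayMs) // wait between requests
      val body = client.send(request, HttpResponse.BodyHandlers.ofString()).body()
      last = parseState(body)
      polls += 1
    }
    last
  }
}
```

The batch id to pass in is the "id" field of the JSON that Livy returns from the submit call.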

Please refer to this post for more details: https://www.linkedin.com/pulse/submitting-spark-jobs-remote-cluster-via-livy-rest-api-ramasamy/

A sample project is available at: https://github.com/ravikramesh/spark-rest-service

Ravikumar