Questions tagged [livy]

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, as well as Spark Context management, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications (a minimal curl sketch of the REST flow follows the feature list below). Additional features include:

  • Have long-running Spark Contexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or DataFrames across multiple jobs and clients
  • Multiple Spark Contexts can be managed simultaneously, and the Spark Contexts run on the cluster (YARN/Mesos) instead of the Livy Server, for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code, or via the Java/Scala client API
  • Ensure security via secure authenticated communication
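
A minimal, hedged curl sketch of that REST flow (host, port, and ids are placeholders; 8998 is Livy's default port):

    # Create an interactive Scala session (POST /sessions)
    curl -s -X POST -H 'Content-Type: application/json' \
         -d '{"kind": "spark"}' \
         http://livy-host:8998/sessions

    # Run a snippet of code in session 0 once it is idle (POST /sessions/{id}/statements)
    curl -s -X POST -H 'Content-Type: application/json' \
         -d '{"code": "sc.parallelize(1 to 10).sum()"}' \
         http://livy-host:8998/sessions/0/statements

    # Poll for the result (GET /sessions/{id}/statements/{statementId})
    curl -s http://livy-host:8998/sessions/0/statements/0

Batch (fire-and-forget) submissions use the separate /batches endpoint, which several of the questions below revolve around.
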

288 questions
1 vote, 1 answer

When trying to use pyspark with Livy, I get PYSPARK_GATEWAY_SECRET error

When starting pyspark from the command line, everything works as expected. However, when using Livy, it doesn't. I made the connection using Postman. First I POST this to the sessions endpoint: { "kind": "pyspark", "proxyUser":…
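
For context, a typical (hedged, placeholder-filled) PySpark session request against the /sessions endpoint looks like the sketch below; the proxyUser value and the host are illustrative, not the asker's values:

    # Create a PySpark session; Livy requires the JSON content type
    curl -s -X POST -H 'Content-Type: application/json' \
         -d '{"kind": "pyspark", "proxyUser": "some_user"}' \
         http://livy-host:8998/sessions

    # Run Python code in it once the session state is "idle"
    curl -s -X POST -H 'Content-Type: application/json' \
         -d '{"code": "print(sc.version)"}' \
         http://livy-host:8998/sessions/0/statements
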
rabejens
1 vote, 1 answer

Application uploaded to Apache Livy fails if not compiled with all dependency jars

I'm submitting a batch job, the Pi job, with a curl command to Livy, but it fails with java.lang.ClassNotFoundException: org.apache.livy.Job. If I compile my jar with all the dependencies inside the jar file, then it works. Why do I need to…
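
One hedged alternative to building a fat jar is to keep the application jar thin and list the extra dependencies, including the Livy API jar, in the jars field of the /batches request; all paths, versions, and class names below are placeholders:

    # Ship dependency jars alongside a thin application jar (equivalent to --jars)
    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "hdfs:///user/me/my-thin-app.jar",
        "className": "com.example.PiJob",
        "jars": ["hdfs:///user/me/livy-api-0.5.0-incubating.jar",
                 "hdfs:///user/me/other-dependency.jar"]
      }' http://livy-host:8998/batches
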
1 vote, 0 answers

Migrating from Spark Jobserver to Apache Livy

I have been working with a standalone Spark server with Jobserver. For various reasons I had to migrate to an Ambari cluster, and since it already has Livy I think it is better to use that instead of Jobserver. Now I'm lost trying to migrate my actual Java…
1 vote, 1 answer

Use existing SparkSession in POST/batches request

I'm trying to use Livy to remotely submit several Spark jobs. Let's say I want to perform the following spark-submit task remotely (with all the options as such): spark-submit \ --class com.company.drivers.JumboBatchPipelineDriver \ --conf…
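
For reference, a hedged sketch of how spark-submit options commonly map onto the /batches JSON body: --class becomes className, each --conf entry goes into the conf object, and the application jar goes in file. The class name is taken from the question; paths, conf values, and args are placeholders:

    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "hdfs:///jars/jumbo-batch-pipeline.jar",
        "className": "com.company.drivers.JumboBatchPipelineDriver",
        "conf": {"spark.executor.memory": "4g",
                 "spark.yarn.queue": "default"},
        "args": ["--date", "2019-01-01"]
      }' http://livy-host:8998/batches
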
y2k-shubham
1 vote, 1 answer

Spark AppName in request body of Apache Livy for batch

How do I set the Spark application name (appName) when submitting a batch job through Apache Livy?
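
A hedged sketch, assuming the standard /batches body: it accepts a name field, and spark.app.name can also be passed through conf (which one YARN ends up showing can depend on deploy mode, so this is illustrative rather than definitive). Paths and names are placeholders:

    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "hdfs:///jars/my-app.jar",
        "className": "com.example.Main",
        "name": "my-batch-app-name",
        "conf": {"spark.app.name": "my-batch-app-name"}
      }' http://livy-host:8998/batches
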
Apurba Pandey
1 vote, 0 answers

Importing dependencies with Livy for Zeppelin and HDInsights Spark

I am trying to write an HDInsight Spark application which reads streaming data from an Azure EventHub. I am using a Zeppelin notebook with the Livy interpreter. I need to import the dependency com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.2 and…
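
If the session is created directly over REST, one hedged option is to pass the Maven coordinate via spark.jars.packages in the session conf; Zeppelin's Livy interpreter exposes Spark settings through livy.spark.* interpreter properties (e.g. livy.spark.jars.packages), which is the analogous knob there. The host below is a placeholder:

    # Let Spark resolve the package from Maven when the Livy session starts
    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "kind": "spark",
        "conf": {"spark.jars.packages": "com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.2"}
      }' http://livy-host:8998/sessions
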
sil
1 vote, 0 answers

Kerberos parameter in sparklyr livy protocol

I want to connect to a domain-joined HDInsight cluster in Azure from a domain-joined VM. Unfortunately, I don't know how to make sure that sparklyr starts a session with Kerberos authentication. sc <- sparklyr::spark_connect(master =…
JanBennk
1 vote, 1 answer

Configuring Livy with Cloudera 5.14 and Spark2: Livy can't find its own JAR files

I'm new to Cloudera, and am attempting to move workloads from an HDP server running Ambari with Livy and Spark 2.2.x to a CDH 5 server with a similar setup. As Livy is not a component of Cloudera, I'm using version 0.5.0-incubating from their…
stuart
1 vote, 1 answer

How can I add a jar to a running spark context?

To elaborate, I am using Livy to create a Spark session and then I submit my jobs to the Livy client, which runs them in the same Spark session. Now, if I need to add a new jar as a dependency in one of the jobs, is there any way to put the jar in…
1 vote, 3 answers

Run livy job via http without uploading jar every time

I'm playing around with Livy/Spark and am a little confused about how to use some of it. There's an example in the Livy examples folder of building jobs that get uploaded to Spark. I like the interfaces that are being used, but I want to interface to…
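
A common hedged workaround is to host the jar somewhere the cluster can already read (HDFS, S3, WASB) and reference it by URI, so each HTTP call only carries JSON instead of re-uploading the jar; the path and class name below are placeholders:

    # The application jar lives on HDFS, so each submission only sends JSON
    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "hdfs:///apps/my-livy-jobs.jar",
        "className": "com.example.MyJob",
        "args": ["run-once"]
      }' http://livy-host:8998/batches
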
Exuro
1 vote, 1 answer

How to set --master, --deploy-mode, --driver-class-path and --driver-java-options through Apache Livy?

I want to set the master, deploy-mode, driver-class-path and driver-java-options for a Spark job triggered through Apache Livy, without having to restart the Livy server when these settings change. How to do this, since there…
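
As a hedged illustration: the per-request conf map covers driver settings such as spark.driver.extraClassPath and spark.driver.extraJavaOptions, while the master and deploy mode are server-wide settings (livy.spark.master and livy.spark.deploy-mode in livy.conf) and are not normally changed per request. All values below are placeholders:

    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "hdfs:///jars/my-app.jar",
        "className": "com.example.Main",
        "conf": {
            "spark.driver.extraClassPath": "/opt/libs/custom.jar",
            "spark.driver.extraJavaOptions": "-Dconfig.resource=prod.conf"
        }
      }' http://livy-host:8998/batches
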
1 vote, 2 answers

Fault Tolerance in Apache Livy

Does anyone have insights on achieving fault tolerance in Apache Livy? Say, for instance, the Livy server fails: how can we achieve HA?
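
As far as the incubating releases go, Livy does not appear to offer active-active HA out of the box; what it does offer is session recovery, so that a restarted (or standby) Livy server can pick up sessions from a shared state store. A hedged livy.conf sketch (key names as in livy.conf.template; verify against your Livy version, ZooKeeper hosts are placeholders):

    # livy.conf -- enable session recovery (not full HA)
    livy.server.recovery.mode = recovery
    livy.server.recovery.state-store = zookeeper
    livy.server.recovery.state-store.url = zk1:2181,zk2:2181,zk3:2181
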
Sumit Khurana
1 vote, 0 answers

Zeppelin --> Shiro --> Livy Integration Error: Cannot start spark | livy is not allowed to impersonate user1

I am facing an issue with the Zeppelin --> Shiro --> Livy integration. It would be great if someone could help me with this. My current environment is set up as follows: • 1 master node and 2 slave nodes running. • Zeppelin installed on the master node up…
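
The "livy is not allowed to impersonate user1" part usually points at Hadoop proxy-user settings rather than Zeppelin or Shiro. A hedged core-site.xml sketch that allows the livy service user to impersonate others (tighten the wildcards for production), combined with livy.impersonation.enabled = true in livy.conf:

    <!-- core-site.xml on the cluster; restart HDFS/YARN after changing -->
    <property>
      <name>hadoop.proxyuser.livy.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.livy.groups</name>
      <value>*</value>
    </property>
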
1 vote, 1 answer

How to set spark.driver.extraClassPath through Apache Livy on Azure Spark cluster?

I would like to add some configuration when a Spark job is submitted via Apache Livy to an Azure cluster. Currently, to launch a Spark job via Apache Livy in the cluster, I use the following command: curl -X POST --data '{"file":…
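
A hedged sketch of the same kind of curl call with a conf block added; the jar URI and the extraClassPath value are placeholders and must point to paths that actually exist on the cluster nodes:

    curl -s -X POST -H 'Content-Type: application/json' -d '{
        "file": "wasb:///jars/my-app.jar",
        "className": "com.example.Main",
        "conf": {"spark.driver.extraClassPath": "/usr/lib/custom/*"}
      }' http://livy-host:8998/batches
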
moun
1 vote, 1 answer

How to submit Spark jobs to Apache Livy?

I am trying to understand how to submit a Spark job to Apache Livy. I added the following dependency to my pom.xml: com.cloudera.livy:livy-api:0.3.0
Markus