Questions tagged [livy]

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, and Spark context management, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications. Additional features include:

  • Long-running Spark contexts that can be used for multiple Spark jobs, by multiple clients
  • Cached RDDs or DataFrames shared across multiple jobs and clients
  • Multiple Spark contexts managed simultaneously, with the contexts running on the cluster (YARN/Mesos) instead of the Livy server, for good fault tolerance and concurrency
  • Jobs submitted as precompiled jars, snippets of code, or via the Java/Scala client API
  • Security via secure authenticated communication
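
As a rough sketch of the REST interface described above: the POST /batches endpoint and its `file`, `className`, `args`, and `conf` fields follow the Livy REST API, while the host, port, jar path, and class name below are placeholder assumptions, not values from any question on this page.

```python
import json
from urllib.request import Request, urlopen

# Placeholder assumption: Livy's default port is 8998; adjust for your cluster.
LIVY_URL = "http://localhost:8998"

def batch_payload(jar_path, class_name, args=None, conf=None):
    """Build the JSON body for POST /batches (submit a precompiled jar)."""
    body = {"file": jar_path, "className": class_name}
    if args:
        body["args"] = args           # positional arguments passed to main()
    if conf:
        body["conf"] = conf           # Spark config, e.g. spark.driver.extraJavaOptions
    return body

def submit(path, body):
    """POST a JSON body to the Livy server (requires a running Livy instance)."""
    req = Request(LIVY_URL + path,
                  data=json.dumps(body).encode("utf-8"),
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Hypothetical jar path and class name, for illustration only.
    payload = batch_payload("/user/alice/my-app.jar", "com.example.Main",
                            args=["--input", "hdfs:///data"],
                            conf={"spark.driver.extraJavaOptions": "-Dapp.env=dev"})
    print(json.dumps(payload, indent=2))
    # submit("/batches", payload)  # uncomment only against a live Livy server
```

The same pattern applies to POST /sessions for interactive snippet execution; only the payload fields change.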

288 questions
0 votes, 1 answer

Livy-Upload JAR in Driver classpath

I am unable to create a Livy interactive session with a dependent JAR required on the driver classpath, using the following command: curl -H "Content-Type: application/json" -X POST -d…
IshitaV • 73 • 7
0 votes, 0 answers

How to set LIVY_CONF_DIR in Cloudera

I have installed the Livy server in Cloudera in /usr/share. I want to set LIVY_CONF_DIR so that I can manage the config files like log4j.properties. Cloudera says this is possible, but I could not find how to define…
0 votes, 1 answer

Livy POST batches API with spark.driver.extraJavaOptions

How can I add spark.driver.extraJavaOptions with a Livy POST /batches API call? I need to pass additional -D flags (JVM system properties).
ASe • 535 • 5 • 15
0 votes, 2 answers

How can I submit a jar with keyword parameters using livy?

I am using Livy (POST /batches) to submit a jar with keyword parameters. For example: spark-submit \ --class xxx \ --master xxx \ --conf xxx=aa \ my_test.jar --arg1 --arg2 In Livy (POST /batches), how can I do this? Does…
allinone • 1 • 2
0 votes, 0 answers

Debugging a Livy job

I have a PySpark job that I submit to Livy by using the Python client. I would like to debug it, and I've come across this article, but it's about Java and uses the configuration value of spark.driver.extraJavaOptions, while I need it to be in…
Bolchojeet • 455 • 5 • 14
0 votes, 2 answers

Livy REST Spark java.io.FileNotFoundException:

I am new to Big Data and have tried to call Spark jobs with Apache Livy. With the spark-submit command line it works fine; with Livy I get an exception. The command line: curl -X POST --data '{"file": "/user/romain/spark-examples.jar", "className":…
EL missaoui habib • 1,075 • 1 • 14 • 24
0 votes, 2 answers

How to invoke Python Jupyter Notebook via REST API hosted on Azure HDInsight?

I have an existing HDInsight installation, on which I've created a few files using PySpark with Python 3 support. I intend to call this Python notebook via a REST API, and Livy Server seems to be the way forward. The problem that I am facing is…
0 votes, 1 answer

How long is a Livy session maintained in head node memory?

Once a job is posted to Livy, it creates a session for it. Then spark-submit submits the job to YARN, and YARN executes it. Up to what point is the session maintained by Livy in memory? Until submission to YARN, or until its execution is…
Sayantan Ghosh • 998 • 2 • 9 • 29
0 votes, 1 answer

livy-server build failing on Windows while running mvn -e clean install -DskipTests in the cmd prompt

I'm executing mvn -e clean install -DskipTests on Windows to set up Apache Livy, and it produces a livy-server build failure. The following are the error logs: [INFO] Livy Project Parent POM ............................ SUCCESS [ 3.803 s] [INFO]…
0 votes, 2 answers

Docker - Possible to reference a file from a Docker container locally?

I have a Spark cluster running in a Docker container (using an image I made myself). It's all working fine. I now want to use Apache Livy, and as per the documentation I need to get in place a couple of environment…
userMod2 • 8,312 • 13 • 63 • 115
0 votes, 1 answer

Why is spark-submit job leaving a process running on cluster (EMR) master node?

I am submitting a Spark job to Livy through an AWS Lambda function. The job runs to the end of the driver program but then does not shut down. If spark.stop() or sc.stop() is added to the end of the driver program, the Spark job finishes on the YARN…
Kieran • 316 • 5 • 15
0 votes, 2 answers

Function to convert R types to Spark types

I have an R data frame that I would like to convert into a Spark data frame on a remote cluster. I have decided to write my data frame to an intermediate csv file that is then read using sparklyr::spark_read_csv(). I am doing this as the data frame…
Alex • 15,186 • 15 • 73 • 127
0 votes, 1 answer

NoSuchElementException while using toDF from Spark / Livy

I am attempting to produce a Spark DataFrame from within Spark, which has been initialised using Apache Livy. I first noticed this issue on this more complicated HBase call: import spark.implicits._ ... spark.sparkContext …
ZenMasterZed • 203 • 2 • 8
0 votes, 0 answers

How to add livy interpreter to Zeppelin running on EMR cluster

What is the simplest way to add the Livy interpreter to Zeppelin running on an EMR cluster? What would be the right steps to get it working?
Bay Max • 222 • 3 • 13
0 votes, 2 answers

How to execute a jar-packaged scala program via Apache Livy on Spark that responds with a result directly to a client request?

What I intend to achieve is having a Scala Spark program (in a jar) receive a POST message from a client e.g. curl, take some argument values, do some Spark processing and then return a result value to the calling client. From the Apache Livy…
Gerold • 1 • 2