Questions tagged [livy]

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface.

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, as well as Spark Context management, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications. Additional features include:

  • Have long-running Spark Contexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or Dataframes across multiple jobs and clients
  • Multiple Spark Contexts can be managed simultaneously, and the Spark Contexts run on the cluster (YARN/Mesos) instead of the Livy Server, for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code or via the Java/Scala client API
  • Ensure security via secure authenticated communication
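
As a rough illustration of the REST interface described above, the following Python sketch builds the JSON POST requests for three commonly used Livy endpoints: `/sessions`, `/sessions/{id}/statements` and `/batches`. The host `localhost`, the helper names (`livy_post`, `create_session`, `run_statement`, `submit_batch`) and the example file path are assumptions made for illustration; only the endpoint paths, the JSON field names and the default port 8998 come from Livy itself. The requests are built but not sent, so the sketch runs without a Livy server.

```python
import json
from urllib.request import Request

# Assumption: a Livy server reachable on its default port 8998.
LIVY_URL = "http://localhost:8998"


def livy_post(path, payload, base_url=LIVY_URL):
    """Build (but do not send) a JSON POST request for a Livy endpoint."""
    return Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(kind="pyspark"):
    """Request for a new interactive session (a long-running Spark Context)."""
    return livy_post("/sessions", {"kind": kind})


def run_statement(session_id, code):
    """Request that runs a snippet of code inside an existing session."""
    return livy_post(f"/sessions/{session_id}/statements", {"code": code})


def submit_batch(file, class_name=None, args=None):
    """Request that submits a precompiled jar or a Python file as a batch job."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return livy_post("/batches", payload)


if __name__ == "__main__":
    # Hypothetical file path, used only to show the shape of the request.
    for req in (
        create_session(),
        run_statement(0, "sc.parallelize(range(10)).sum()"),
        submit_batch("hdfs:///jobs/wordcount.py", args=["input.txt"]),
    ):
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

To actually send one of these requests, pass it to `urllib.request.urlopen(req)` against a running Livy server and parse the JSON response; statement results are then typically retrieved by polling the statement's URL with GET requests.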

288 questions
1
vote
2 answers

Is any other configuration needed besides the Livy server's livy.conf?

I have set up Docker for Hadoop YARN, and I am trying to set up the Apache Livy server to make API calls for job submissions. The logs below show that livy-server starts, runs for a while, and then stops automatically 19/08/17 07:09:35…
Karthik
  • 23
  • 1
  • 6
1
vote
1 answer

Submitting pyspark job through livy using user defined parameters

Our simple POST request to Livy for a self-contained PySpark module works fine. However, we have reusable components used by multiple PySpark modules. Moreover, all our code is triggered from the main.py module using --job…
1
vote
1 answer

I need help using django_cron

I am currently working with HDFS, Apache Livy and Django. The goal is to send a request that runs code stored in HDFS, which in turn calls Livy to create batches. For now, everything is working: I have a basic wordcount stored…
Bromania
  • 15
  • 7
1
vote
1 answer

LivyClient uploadJar failing with py4j.Py4JException: Error while obtaining a new communication channel

I am trying to submit a spark job through Apache Livy but the LivyClient's uploadJar method is failing. This is the code (very similar to the PiJob example): LivyClientBuilder builder = new LivyClientBuilder(); LivyClient client =…
1
vote
0 answers

Ports used by sparklyr with Livy

Is any port other than the Livy port (default 8998) needed to use Livy with sparklyr? I have two machines from which I try to use sparklyr with Livy: my local Windows station in network zone A, and a remote Linux server with R in network…
mattino
  • 95
  • 7
1
vote
1 answer

AWS EMR Livy sessions state dead

I'm using EMR with Livy, but Livy kills some sessions. Is there any way to wait for other tasks to complete instead of killing those sessions? Thanks. Here is the output for the killed sessions: Warning: Ignoring non-spark config property:…
1
vote
0 answers

What happens to a Spark application requesting more memory than the cluster has?

If there is a Spark cluster with 5 worker nodes, each with, say, x GB of memory, what would happen to an application if: 1. Driver memory requested in the application is > x GB 2. Driver Memory + Executor Memory * Number of…
Sayantan Ghosh
  • 998
  • 2
  • 9
  • 29
1
vote
2 answers

How to submit PySpark and Python jobs to Livy

I am trying to submit a PySpark job to Livy using the /batches endpoint, but I haven't found any good documentation. Life has been easy because we are submitting Scala-compiled JAR files to Livy, and specifying the job with className. For the JAR…
Eric Meadows
  • 887
  • 1
  • 11
  • 19
1
vote
0 answers

Spark throws AnalysisException: Undefined function: 'count' for a Spark built-in function

If I run the following code in Spark (2.3.2.0-mapr-1901), it runs fine on the first run. SELECT count( `cpu-usage` ) as `cpu-usage-count` , sum( `cpu-usage` ) as `cpu-usage-sum` , percentile_approx( `cpu-usage`, 0.95 ) as…
ZenMasterZed
  • 203
  • 2
  • 8
1
vote
1 answer

Sending JSON argument as a livy parameter in Java

I am trying to submit a Spark job through Livy. In the job, I need to pass JSON data in the args parameter when invoking Livy. This is what I have done: String payload = "{\"name\": \"myname\", \"id\": \"101\"}"; String data="{…
Ayan Biswas
  • 1,641
  • 9
  • 39
  • 66
1
vote
0 answers

How to reduce the turnaround time of Livy

In my application I submit a Spark job through Livy and get the result back, uploading the jar file to the cluster every time, but it takes 20 seconds to return the results. Is there any way I could reduce the time taken by Livy…
Pyd
  • 6,017
  • 18
  • 52
  • 109
1
vote
1 answer

Apache/Cloudera HUE / Livy Spark Server - InterpreterError: Fail to start interpreter

I'm at a loss at this point. I'm trying to run PySpark/SparkR on Apache HUE 4.3, using Spark 2.4 + Livy Server 0.5.0. I've followed every guide I can find, but I keep running into this issue. Basically, I can run PySpark/SparkR through the command line,…
jhomr
  • 477
  • 3
  • 16
1
vote
1 answer

Trigger spark application with Apache Livy with parameters

I am trying to trigger a Spark application from Apache Livy; however, I cannot seem to get it to work. I am using the latest version (0.5) and passing args based on the documentation https://livy.incubator.apache.org/examples/; however, from the logs when…
Mez
  • 4,666
  • 4
  • 29
  • 57
1
vote
1 answer

Apache Spark and Livy cluster

Scenario: I have a Spark cluster and I also want to use Livy. I am new to Livy. Problem: I built my Spark cluster using Docker Swarm and I will also create a service for Livy. Can Livy communicate with an external Spark master and send a job…
ugur
  • 400
  • 6
  • 20
1
vote
1 answer

Unable to kill application from Yarn RM UI

I have Dataproc set up on Google Cloud Platform with Apache Livy installed. I am submitting jobs using the Livy REST API. When I try to kill Livy jobs from the YARN RM, I get the error below in the browser console…
Deepak Verma
  • 653
  • 1
  • 10
  • 24