Questions tagged [livy]

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, and Spark Context management, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications. Additional features include:

  • Long-running Spark Contexts that can be used for multiple Spark jobs, by multiple clients
  • Cached RDDs or DataFrames can be shared across multiple jobs and clients
  • Multiple Spark Contexts can be managed simultaneously, and they run on the cluster (YARN/Mesos) instead of on the Livy Server, for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code, or via the Java/Scala client API
  • Security via secure authenticated communication
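
As a concrete sketch of the REST interface described above (assuming a Livy server at a hypothetical http://localhost:8998), an interactive session is created with POST /sessions and code snippets run with POST /sessions/{id}/statements. The snippet below only constructs the JSON request bodies; actually sending them requires a running Livy server:

```python
import json

LIVY_URL = "http://localhost:8998"  # hypothetical Livy endpoint

# POST /sessions -- create an interactive PySpark session
create_session = {"kind": "pyspark"}

# POST /sessions/{id}/statements -- run a code snippet in that session
run_statement = {"code": "sc.parallelize(range(10)).sum()"}

# The REST API exchanges JSON bodies.
headers = {"Content-Type": "application/json"}

print("POST", LIVY_URL + "/sessions", "->", json.dumps(create_session))
print("POST", LIVY_URL + "/sessions/0/statements", "->", json.dumps(run_statement))
```

The session id in the statements URL comes from the JSON returned by the /sessions call; polling GET /sessions/{id}/statements/{stmt_id} then retrieves the (possibly asynchronous) result.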


288 questions
2
votes
1 answer

Error when running Spark via Livy

I am running my Spark job using Livy; however, I get the exception below: java.util.concurrent.ExecutionException: java.io.IOException: Internal Server Error: "java.util.concurrent.ExecutionException: org.apache.livy.rsc.rpc.RpcException:…
Luckylukee
2
votes
0 answers

Best Practice to Generate PySpark Statements Using C#?

I am writing an ASP.NET Web API that at some point will communicate with an Apache Spark cluster. The communication is established using a Livy server on the Spark cluster, which exposes a REST API, and an HTTP client I wrote. In my business…
Anis Tissaoui
2
votes
2 answers

Zeppelin 0.7.2 version does not support spark 2.2.0

How can I downgrade the Spark version? What could be the other solutions? I have to connect my Hive tables to Spark using a Spark session, but the Spark version is not supported by Zeppelin.
SHWETA WIN
2
votes
1 answer

Livy server submits the jar every time a batch job is submitted

While submitting an Apache Spark batch job using the Livy server, it uploads the jar file (containing the application) every time, i.e., for every batch job submission. This seems to increase the job submission time. Is there a way to refer to the jar present in the…
Parithi
2
votes
3 answers

Livy PySpark Python Session Error in Jupyter with Spark Magic - ERROR repl.PythonInterpreter: Process has died with 1

I'm running a Spark v2.0.0 YARN cluster. I have Livy running beside the Spark master. I have set up a Jupyter Python3 notebook, installed Spark Magic, and followed the necessary instructions to connect Spark Magic to Livy, although when…
mildog8
2
votes
1 answer

Access a data file from the current livy session

I have a Spark cluster running on Hadoop in YARN mode. I have configured a Livy server to interact with it and submit client Spark jobs to the cluster. I uploaded a data file along with the jar from the Java program to Livy, which gets uploaded in the…
msingh
2
votes
2 answers

execute Spark jobs, with Livy, using `--master yarn-cluster` without making systemwide changes

I'd like to execute a Spark job via an HTTP call from outside the cluster using Livy, where the Spark jar already exists in HDFS. I'm able to spark-submit the job from a shell on the cluster nodes, e.g.: spark-submit --class io.woolford.Main --master…
Alex Woolford
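
One approach often suggested for this situation (a sketch under assumptions, not verified against this cluster) is to POST to Livy's /batches endpoint with the jar's HDFS path in the file field, which mirrors the spark-submit invocation without requiring shell access. Only io.woolford.Main comes from the question; the HDFS path is hypothetical, and whether the job lands in yarn-cluster mode is controlled by the server-side livy.conf (livy.spark.master / livy.spark.deploy-mode) rather than by the request:

```python
import json

# Sketch of a Livy batch submission equivalent to the spark-submit in the
# question. The HDFS path is hypothetical; io.woolford.Main is from the post.
batch_request = {
    "file": "hdfs:///user/spark/jobs/my-job.jar",  # jar already in HDFS
    "className": "io.woolford.Main",
    "args": [],
}

# POST this JSON body to http://<livy-host>:8998/batches.
payload = json.dumps(batch_request)
print(payload)
```

Because the master and deploy mode live in the server-side configuration, no system-wide change on the client is needed: the caller only ships this JSON over HTTP.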
1
vote
1 answer

How to mock connection for airflow's Livy Operator using unittest.mock

@mock.patch.dict("os.environ", AIRFLOW_CONN_LIVY_HOOK="http://www.google.com", clear=True) class TestLivyOperator(unittest.TestCase): def setUp(self): super().setUp() self.dag = DAG(dag_id=…
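
Setting aside the Airflow-specific pieces truncated above, the core mechanism — patching os.environ so that a hook resolves its connection from an AIRFLOW_CONN_* environment variable — can be sketched with the standard library alone (the connection URI below is the one from the question; everything else is an assumption):

```python
import os
import unittest
from unittest import mock

# Airflow resolves connections from AIRFLOW_CONN_* environment variables, so
# patching os.environ is enough for a hook to "see" a fake connection URI.
@mock.patch.dict("os.environ", {"AIRFLOW_CONN_LIVY_HOOK": "http://www.google.com"}, clear=True)
class TestLivyConnEnv(unittest.TestCase):
    def test_env_var_is_patched(self):
        # Inside the test, the patched environment is active.
        self.assertEqual(os.environ["AIRFLOW_CONN_LIVY_HOOK"], "http://www.google.com")
```

Run with python -m unittest. mock.patch.dict also accepts the keyword-argument form used in the question instead of an explicit dict, and the patch is automatically undone after each test method.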
1
vote
1 answer

PySpark (via sparkmagic + livy): There is insufficient memory for the Java Runtime Environment to continue

I'm using SageMaker, connecting to an EMR cluster via sparkmagic and Livy. Very frequently (at session startup, not running any code) I get: > The code failed because of a fatal error: Session unexpectedly > reached final status 'dead'. See…
Luis Leal
1
vote
1 answer

Can't init session in Spark. How to debug "User capacity has reached its maximum limit."?

I'm trying to create a session in Apache Spark using the Livy REST API. It fails with the following error: User capacity has reached its maximum limit. The user is running another Spark job. I don't understand which capacity reached its maximum and…
neves
1
vote
0 answers

How to get the state of a remote job in Livy using Java API

Is it possible to monitor the state of an already running remote job in Livy with the Java API? How can this be done? I looked over the Livy Java API docs. A JobHandle would let me poll the state of the app. However, the only way I can see to obtain it is…
oskarryn
1
vote
0 answers

How can we connect to remote spark cluster via jupyterhub?

First of all, my agenda is to be able to use Spark code inside JupyterHub. In other words, I want to connect a remote Spark cluster to JupyterHub. After searching, I came up with two solutions: 1) Livy and 2) Spark Magic. I have tried Livy but…
1
vote
1 answer

Spark requests more cores than asked for when calling the POST Livy batch API in Azure Synapse

I have an Azure Synapse Spark cluster with 3 nodes of 4 vCores and 32 GB memory each. I am trying to submit a Spark job using the Azure Synapse Livy batch APIs. The request looks like this: curl --location --request POST…
1
vote
1 answer

Databricks Notebook as Substitute for livy sessions endpoint

I want to execute a Databricks notebook's code via the Databricks API and get the output of the notebook's code as the response. Is it possible, or is there any workaround for the same? Is the same possible with the Databricks SQL API?
1
vote
0 answers

PySpark batch job's configuration submitted through Apache Livy have no effect

I submitted a Spark batch job through Livy to the remote cluster with the following request body: REQUEST_BODY = { 'file': '/spark/batch/job.py', 'conf': { 'spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation': 'true', …
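
For comparison, a complete minimal /batches body with a conf map might look like the sketch below. The file path and the legacy flag come from the question; the extra key is illustrative. Conf values are passed as strings, and a spark.* setting only takes effect if the cluster's Spark version actually recognizes it:

```python
import json

# Illustrative Livy /batches request body; the file path and legacy flag
# mirror the question, the executor-memory key is an assumption.
request_body = {
    "file": "/spark/batch/job.py",
    "conf": {
        # Spark confs are passed as string values in the JSON body.
        "spark.sql.legacy.allowCreatingManagedTableUsingNonemptyLocation": "true",
        "spark.executor.memory": "2g",
    },
}

print(json.dumps(request_body, indent=2))
```

If a submitted conf appears to have no effect, comparing the job's Spark UI "Environment" tab against this body is a quick way to see whether Livy forwarded the setting or the target Spark version silently ignored it.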