Questions tagged [livy]

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, and Spark Context management, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications. Additional features include:

  • Have long-running Spark Contexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or DataFrames across multiple jobs and clients
  • Manage multiple Spark Contexts simultaneously; the Spark Contexts run on the cluster (YARN/Mesos) instead of the Livy Server, for good fault tolerance and concurrency
  • Submit jobs as precompiled jars, snippets of code, or via the Java/Scala client API
  • Ensure security via secure authenticated communication
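
As a rough illustration of the REST workflow described above, the following sketch creates an interactive session, runs one snippet, and reads back the result. It is a minimal example under stated assumptions, not a reference client: it assumes a Livy server reachable at `http://localhost:8998` (Livy's default port) with no authentication, and the helper names `_call` and `run_snippet` are hypothetical. It uses only the `/sessions` and `/sessions/{id}/statements` endpoints of the Livy REST API.

```python
import json
import time
import urllib.request

# Assumption: Livy runs locally on its default port without authentication.
LIVY_URL = "http://localhost:8998"


def _call(method, path, body=None, base_url=LIVY_URL):
    """Send one JSON request to the Livy server and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_snippet(code, kind="pyspark", call=_call, poll_seconds=1.0):
    """Create a session, run one code snippet in it, and return the output.

    `call` is injectable so the flow can be exercised without a real server.
    """
    session = call("POST", "/sessions", {"kind": kind})
    session_path = "/sessions/%d" % session["id"]
    try:
        # Wait until the Spark context behind the session is ready.
        while True:
            state = call("GET", session_path)["state"]
            if state == "idle":
                break
            if state in ("error", "dead", "killed"):
                raise RuntimeError("session ended in state %r" % state)
            time.sleep(poll_seconds)

        # Submit the snippet, then poll the statement until it finishes.
        statement = call("POST", session_path + "/statements", {"code": code})
        statement_path = "%s/statements/%d" % (session_path, statement["id"])
        while statement["state"] not in ("available", "error", "cancelled"):
            time.sleep(poll_seconds)
            statement = call("GET", statement_path)
        return statement["output"]
    finally:
        # A long-running, shared session would skip this cleanup step.
        call("DELETE", session_path)
```

Against a running server, `run_snippet("1 + 1")` would return the statement's `output` object. Keeping the session alive between calls, rather than deleting it each time, is what allows several jobs and clients to reuse the same Spark Context.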

288 questions
0
votes
0 answers

Livy logs location (in S3) for an EMR cluster (debugging "Neither SparkSession nor HiveContext/SqlContext is available")

I'm using AWS SageMaker connected to an EMR cluster via Livy. With a "normal" session (default session config), the connection is created and the Spark context works fine, but when…
Luis Leal
  • 3,388
  • 5
  • 26
  • 49
0
votes
1 answer

Apache Livy: Could not find or load main class org.apache.livy.server.LivyServer

I am trying to start an Apache Livy 0.8.0 server on my Windows 10 machine for Spark 3.1.2 and Hadoop 3.2.1. I am taking help from here. I have successfully built Apache Livy using Maven (I have attached a of it), but I am not able to run the livy…
Mirza Asad
  • 606
  • 1
  • 7
  • 18
0
votes
0 answers

Orchestrating PySpark scripts in AWS Step Functions

My Spark job (we are processing 100 GB of data) takes one hour to complete. In Step Functions, I am using a Lambda function to submit the job through Livy, and I created one more Lambda function to get the job status. The issue is that after 15 minutes the step…
0
votes
1 answer

Livy on K8S, namespace restriction

I have Spark (3.0.1), Livy (0.8.0) and JupyterHub (sparkmagic) running on K8S in a specific namespace, with the Kubernetes master used as the resource manager. When trying to create a PySpark session in a JupyterHub notebook I get the error: 22/02/04 12:09:16…
Artyom Rebrov
  • 651
  • 6
  • 23
0
votes
1 answer

I have trouble using Apache Livy: my interpreter doesn't work

I am currently trying to make a REST API server with Apache Livy. I started a livy-server and used the Livy API from my Python code. Here's my Python code that interacts with the livy-server: import json, pprint, requests, textwrap #start…
cicada
  • 19
  • 2
0
votes
1 answer

How do I set up sparkmagic to work with DataProc through Livy?

I have a DataProc cluster running in GCP. I ran the Livy initialization script for it, and I can access the livy/sessions link through the gateway interface. I have the following set up for my sparkmagic config.json: { …
oneextrafact
  • 159
  • 1
  • 9
0
votes
0 answers

How to create a cached context in Spark using Apache Livy?

I would like to create a cached context using Apache Livy, such that while submitting spark jobs, it does not require creating a new context every time.
0
votes
1 answer

AWS - Lambda cannot access Livy endpoint for EMR

For reference, Livy is a REST endpoint used to pull data from a cluster. Within the same account, my Lambda function always times out when attempting to access the Livy endpoint using by…
Jeremy
  • 5,365
  • 14
  • 51
  • 80
0
votes
2 answers

Need help submitting a Hudi DeltaStreamer job via Apache Livy

I am a little confused about how to pass the arguments as REST API JSON. Consider the spark-submit command below. spark-submit --packages org.apache.hudi:hudi-utilities-bundle_2.11:0.5.3,org.apache.spark:spark-avro_2.11:2.4.4 \ --master yarn \ …
0
votes
1 answer

Spark job submitted via Livy throws GSSException: No valid credentials provided (Mechanism level: Failed to find any kerberos tgt)

I am trying to launch my Spark batch job using Livy. From the logs, I see that it starts running but fails when it tries to access the Hive metastore with the following Kerberos error: GSSException: No valid credentials provided (Mechanism level:…
smang
  • 95
  • 1
  • 8
0
votes
1 answer

File paths become hidden and inaccessible when using Kerberos authentication and Livy (via sparkmagic)

I am using this quickstart guide (https://github.com/aws-quickstart/quickstart-hail) to set up EMR with SageMaker. Due to security requirements, I had to enable Kerberos (local KDC within the EMR cluster) and I referenced this guide…
Reivax
  • 33
  • 2
0
votes
3 answers

How can I run Zeppelin with Kerberos in CDH 6.3.2

Zeppelin 0.9.0 does not work with Kerberos. I have added "zeppelin.server.kerberos.keytab" and "zeppelin.server.kerberos.principal" to zeppelin-site.xml, but I also get the error "Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host…
ighack
  • 31
  • 4
0
votes
1 answer

Pass command-line arguments to Livy - Interactive session

I tried deploying a PySpark application from Jenkins through a Livy interactive session: livy_submit --livy-url -s sample.py. I have a scenario where we need to pass external parameters, something similar to the "--args" that is used in…
0
votes
1 answer

Module error caused from AWS EMR by running PySpark code in Apache Livy via lambda function

I am running PySpark code in an AWS EMR cluster. I passed the Spark properties to the Livy application via a Lambda function: import requests import json def lambda_handler(event, context): master_dns = event.get('clusterDetails', {}).get('Cluster',…
0
votes
1 answer

Converting spark-submit into Livy REST JSON protocol

I am trying to rewrite a spark-submit command, which has user-defined arguments such as packages, repositories, jars, files and application arguments, into the Livy REST JSON protocol. Please find more details below. spark-submit command: spark-submit \ --packages…