
My apologies if I have made any mistakes in my language.

I want to install the Apache Livy server on a node (VM instance) outside the Spark cluster. How can I do this so that the Livy server points to the Spark cluster?

I have downloaded and built Livy on the VM instance using:

git clone https://github.com/cloudera/livy.git
cd livy
mvn clean package -DskipTests
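
(As the comments below point out, this Cloudera repository is the outdated mirror; the project has since moved to the Apache incubator. A build from the Apache repository would look like this instead, assuming the same Maven toolchain:)

git clone https://github.com/apache/incubator-livy.git
cd incubator-livy
mvn clean package -DskipTests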

I made the following changes in livy/conf/livy.conf:

livy.spark.master = spark://{spark-cluster-master_IP}:7077
livy.spark.deploy-mode = cluster
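
(A quick sanity check before starting Livy, sketched in Python, is to confirm that the master address configured above is reachable from this VM; the hostname below is a placeholder for the real master IP:)

import socket

# Placeholder address; substitute the standalone master's real host and port
master = ("spark-cluster-master", 7077)
try:
    socket.create_connection(master, timeout=5).close()
    print("Spark master reachable at %s:%d" % master)
except OSError as err:
    print("Cannot reach Spark master: %s" % err)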

Then I started the Livy server with:

livy/bin/livy-server start
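
(To confirm the server came up, assuming the default port 8998, a GET against the sessions endpoint should answer with an empty session list:)

curl http://localhost:8998/sessions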

And I am trying to interact with it from Python via the REST API:

>>> import json, pprint, requests, textwrap
>>> host = 'http://localhost:8998'
>>> data = {'kind': 'spark'}
>>> headers = {'Content-Type': 'application/json'}
>>> r = requests.post(host + '/sessions', data=json.dumps(data), headers=headers)
>>> r.json()
{u'kind': u'spark', u'log': [], u'proxyUser': None, u'appInfo': {u'driverLogUrl': None, u'sparkUiUrl': None}, u'state': u'starting', u'appId': None, u'owner': None, u'id': 2}
>>> session_url = host + r.headers['location']
>>> r = requests.get(session_url, headers=headers)
>>> r.json()
{u'kind': u'spark', u'log': [], u'proxyUser': None, u'appInfo': {u'driverLogUrl': None, u'sparkUiUrl': None}, u'state': u'dead', u'appId': None, u'owner': None, u'id': 2}
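
(To see why the session died, one option is to pull the session's log over the same REST API; GET /sessions/{id}/log is part of Livy's interface:)

>>> r = requests.get(session_url + '/log', headers=headers)
>>> for line in r.json().get('log', []):
...     print(line)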

The session state is showing as dead.

The log file (livy/logs/livy-umesh-server.out) shows nothing about why the Spark session died:

livyserver:~$ cat livy/logs/livy-umesh-server.out
log4j:WARN No appenders could be found for logger (com.cloudera.livy.LivyConf).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
  • Where exactly are you stuck? Did you download Livy? Look at its configuration files? Run the startup script and get any errors? – OneCricketeer Sep 25 '19 at 04:52
  • @cricket_007 I have edited my question and mentioned what I did, so please have a look. – Umesh Gaikwad Sep 25 '19 at 13:00
  • You're cloning the wrong project (the last commit was years ago, and it moved to the Apache incubator). You can find how to download and set up Livy here: http://livy.incubator.apache.org/get-started/ – OneCricketeer Sep 25 '19 at 13:18
  • Thanks, @cricket_007, for your quick and kind response; it's working well with `livy.spark.master = local` and `livy.spark.deploy-mode = cluster`. But I wanted to submit jobs on YARN of a Google Dataproc cluster, so where could I find the steps to configure Livy? – Umesh Gaikwad Sep 27 '19 at 10:51
  • livy.spark.master needs to be set to your remote YARN resource manager address. – OneCricketeer Sep 27 '19 at 12:43
  • @cricket_007 Thanks for the response, but I am a little new to YARN. Could you please tell me which address I need to set in livy.spark.master? – Umesh Gaikwad Oct 01 '19 at 11:42
  • Should just be the ResourceManager host:port of the Dataproc cluster – OneCricketeer Oct 01 '19 at 14:09
  • Hello @cricket_007, I tried `livy.spark.master = x.x.x.x:8088` but it throws the error `Master must either be yarn or start with spark, mesos, k8s, or local`. Is there another way to set a remote YARN resource manager address? – Umesh Gaikwad Oct 03 '19 at 05:56
  • Sorry, you need to set it equal to yarn; then, in the Spark conf folder, you would edit the yarn-site.xml file to point at that address (see the configuration sketch after these comments). But I would verify that you can submit a regular job outside of Livy first. – OneCricketeer Oct 03 '19 at 12:39
  • @umesh were you able to resolve this? – Avik Aggarwal Dec 01 '20 at 08:47
  • @OneCricketeer were you able to confirm if you can submit a regular job outside of Livy? – Avik Aggarwal Dec 01 '20 at 08:56
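
(Pulling the comment thread together: a sketch of pointing Livy at a remote YARN cluster. The paths and the ResourceManager address are placeholders, and the Hadoop client configuration has to be copied from the Dataproc cluster onto the Livy VM:)

# livy/conf/livy.conf
livy.spark.master = yarn
livy.spark.deploy-mode = cluster

# livy/conf/livy-env.sh
export SPARK_HOME=/path/to/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf

<!-- $HADOOP_CONF_DIR/yarn-site.xml, placeholder address -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>x.x.x.x</value>
</property>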

1 Answer


To run interactive sessions with Livy, you need to ensure that there is a network connection between Livy and the Spark driver in both directions, since they exchange RPC calls. If this is the problem in your case, you will see messages in the Spark driver logs about failures to connect to the Livy RPC server, or about callback timeouts.

You may also want to enable more verbose logging to see the detailed behaviour of Livy.
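
(A sketch, assuming the standard Livy layout: copying the shipped log4j template into place also silences the "No appenders" warnings seen above, and raising the root level gives the detail:)

cp livy/conf/log4j.properties.template livy/conf/log4j.properties

# then in livy/conf/log4j.properties, raise the level:
log4j.rootCategory=DEBUG, console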