
I just installed Apache Zeppelin (built from the latest source in the git repo) and confirmed that it is up and running on port 10008. I created a new notebook with a single line of code:

val a = "Hello World!"

I ran this paragraph and saw the error below:

java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
    at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:139)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:137)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:257)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:197)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:304)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Any clue?

My backend is Spark 1.5, and I verified via the interpreter's web interface that Zeppelin points to the right version of Spark and the appropriate spark.home.

sag
Bala

11 Answers


The error may also be caused by a failure that occurs while Zeppelin is trying to create the interpreter.

Zeppelin starts the interpreter in a separate process and tries to connect to it using the Thrift protocol.

In my case, I got this error when trying to assign 5 GB to the Spark driver in spark-defaults.conf. It was resolved by commenting out this line (or assigning 4g or less):

#spark.driver.memory              5g

You could have a look at this JIRA: ZEPPELIN-305.

EDIT:

This error can be caused by anything that prevents the Spark interpreter process from starting. Recently, I got it after adding JMX options to ZEPPELIN_JAVA_OPTS, which caused the interpreter process to use the same JMX port as the Zeppelin process, giving a "Port already in use" error.

Please check the Zeppelin logs (by default they are in ZEPPELIN_DIR/logs/) to see what is happening when Zeppelin tries to start the Spark interpreter.
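For example, a quick way to look at the most recent log output (assuming ZEPPELIN_DIR stands for your Zeppelin installation directory; the exact file names depend on your installation and configured interpreters):

ls -lt "$ZEPPELIN_DIR"/logs/             # newest log files first
tail -n 100 "$ZEPPELIN_DIR"/logs/*.log   # look for the interpreter startup failure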

user1314742

I had this issue when $SPARK_HOME was not set correctly.
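For reference, a minimal sketch of setting it for Zeppelin (the path below is only a placeholder; point it at your actual Spark installation):

# In $ZEPPELIN_HOME/conf/zeppelin-env.sh
export SPARK_HOME=/opt/spark   # placeholder path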

LK__

An error stack like [1] below could mean a lot of different things: the Zeppelin server could not connect to a local interpreter because it did not start or it died. It seems like a Zeppelin bug, as it can't detect when interpreter.sh exits without creating a Zeppelin interpreter process; I submitted https://issues.apache.org/jira/browse/ZEPPELIN-1984 to track that.

In all our cases, each with a different root cause, the real error was only revealed by adding

LOG="/tmp/interpreter.sh-$$.log"
date >> $LOG
set -x
exec >> $LOG
exec 2>&1

to $ZEPPELIN_HOME/bin/interpreter.sh; a /tmp/interpreter.sh-*.log file will then show you the actual problem.
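After re-running a paragraph, the newest of these trace files is the one to read; a minimal example (the file pattern comes from the snippet above):

ls -t /tmp/interpreter.sh-*.log | head -1             # newest trace file
less "$(ls -t /tmp/interpreter.sh-*.log | head -1)"   # inspect the failing command and its output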

[1]

ERROR [2017-01-18 16:54:38,533] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:1645) - Error
org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:232)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:400)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:316)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)

Edit: another way to reveal the true root cause is to change log4j to see the output of the Spark interpreter process, as hinted by Jeff in ZEPPELIN-1984. Change your ZEPPELIN_HOME/conf/log4j.properties as follows:

log4j.rootLogger = INFO, dailyfile

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.appender.dailyfile.DatePattern=.yyyy-MM-dd
log4j.appender.dailyfile.Threshold = DEBUG
log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
log4j.appender.dailyfile.File = ${zeppelin.log.file}
log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.logger.org.apache.zeppelin.interpreter.InterpreterFactory=DEBUG
log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
log4j.logger.org.apache.zeppelin.scheduler=DEBUG
log4j.logger.org.apache.zeppelin.livy=DEBUG
log4j.logger.org.apache.zeppelin.flink=DEBUG
log4j.logger.org.apache.zeppelin.spark=DEBUG
log4j.logger.org.apache.zeppelin.python=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.util=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG

and restart Zeppelin. Note: it may produce excessive logging. My original advice of adding a few lines to interpreter.sh doesn't require restarting Zeppelin.
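A restart sketch using the standard daemon script (the path assumes a default installation layout):

$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart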

I also created a pull request to (partially) fix this issue: https://github.com/apache/zeppelin/pull/1921

Update 1/24/2017: https://issues.apache.org/jira/browse/ZEPPELIN-1984 is fixed in master and will be included in the Zeppelin 0.8 release. Two important fixes are part of ZEPPELIN-1984:

  • you won't get "connection refused" when an interpreter process can't start;
  • Zeppelin will show the root cause in the paragraph output.
Tagar

Problem

Zeppelin runs a custom Spark application on localhost; sometimes (if you have multiple networks, such as a VPN) it cannot use 127.0.0.1.

This is because of this source code: https://github.com/apache/zeppelin/blob/v0.8.1/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterUtils.java#L104

public static String findAvailableHostAddress() throws UnknownHostException, SocketException {
    InetAddress address = InetAddress.getLocalHost();
    if (address.isLoopbackAddress()) {
      // If the local host resolves to a loopback address, return the first IPv4
      // address of any non-loopback interface instead (e.g. a VPN adapter).
      for (NetworkInterface networkInterface : Collections
          .list(NetworkInterface.getNetworkInterfaces())) {
        if (!networkInterface.isLoopback()) {
          for (InterfaceAddress interfaceAddress : networkInterface.getInterfaceAddresses()) {
            InetAddress a = interfaceAddress.getAddress();
            if (a instanceof Inet4Address) {
              return a.getHostAddress();
            }
          }
        }
      }
    }
    // Otherwise fall back to whatever the local host resolves to.
    return address.getHostAddress();
}

You can see that the Spark interpreter is running and listening on a "weird" IP:

ps aux | grep spark
zep/bin/interpreter.sh -d zep/interpreter/spark -c 10.100.37.2 -p 50778 -r : -l /zep/local-repo/spark -g spark

But the Zeppelin UI tries to connect to localhost, which resolves to 127.0.0.1, hence the Connection refused.

Solution

  • Disconnect from the VPN before running the Spark interpreter
  • Use v0.8.2, which fixes this via a new environment variable, ZEPPELIN_LOCAL_IP (see the sketch after this list)
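A minimal sketch of that workaround, assuming Zeppelin 0.8.2+ with a standard zeppelin-env.sh, and that binding to the loopback address is what you want:

# In $ZEPPELIN_HOME/conf/zeppelin-env.sh
export ZEPPELIN_LOCAL_IP=127.0.0.1   # force Zeppelin and its interpreters onto the loopback address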
Thomas Decaux

I noticed that the URL that points to Spark was not correct. Once I corrected it, it worked fine. Thanks anyway.
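For context, this usually means the master property in the Spark interpreter settings; a hedged sketch with a placeholder host (7077 is the standalone default port; use yarn-client or local[*] if that matches your setup):

# Spark interpreter property (Interpreter menu -> Spark -> edit)
master = spark://your-spark-master:7077   # placeholder host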

Bala

I had the same issue when $YARN_QUEUE was set incorrectly.
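A hedged sketch of what to check (the queue name below is only a placeholder; it must match a queue that actually exists on your cluster):

# Wherever the variable is exported for Zeppelin in this setup, e.g. zeppelin-env.sh
export YARN_QUEUE=default   # placeholder queue name
# The corresponding Spark setting is the spark.yarn.queue property.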

thepolina

This question has been open for a year now, and I'm not sure whether the solution to the problem was found. Recently, I bumped into a similar error using YARN Spark on Amazon EMR. As I debugged it, I realized the following, and would suggest people try this if they find themselves in similar shoes (the solution is based on EMR, but should be similar on other offerings):

1. kill -9 `ps -ef | grep zeppelin | grep -v grep | awk '{print $2}'` (makes sure zombie processes are taken care of)
2. kill -9 `ps -ef | grep hadoop-yarn-resourcemanager | grep -v grep | awk '{print $2}'`
3. sudo /sbin/restart hadoop-yarn-resourcemanager
4. At times, simply starting the resource manager does not start the name node: `sudo start hadoop-hdfs-namenode`
5. sudo /usr/lib/zeppelin/bin/zeppelin-daemon.sh start
6. Use telnet to make sure that the default ports are open for the required services (see the example below).
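For example (8080 is Zeppelin's default web port; adjust the host and port for the service you are checking):

telnet localhost 8080     # or: nc -zv localhost 8080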

At the end of all this, one should be able to get Zeppelin running properly with a valid SparkContext. Hope this was useful.

Pramit
  • This way you will run Zeppelin under the root user, which is a huge security issue. Imagine your user using %sh and then running arbitrary commands as the root user. – Alan Kis Jun 13 '18 at 12:46

In my case, (project-root)/node_modules/zeppelin/spark-2.0.2-bin-hadoop2.7 was not installed, for some unknown reason. rm -rf node_modules; npm cache clear; npm i fixed it.

christopherbalz

I fixed this error by changing the Spark master from yarn-cluster to yarn-client, as set in zepplin/conf/defalt.sh.

Jary zhen

I got exactly the same error when I tried to run Zeppelin with Spark in the same Docker container on a micro instance in Amazon ECS.

The source of the error was visible in the output log in %ZEPPELIN_HOME%/logs/*.out, which said that Zeppelin failed to start the Spark interpreter due to low memory. So I moved my Docker image to an instance with more memory.

Igor Bljahhin

In my case, I have three nodes in my cluster. Although Spark was installed on all three of them, Zeppelin was installed on only one.

So, in the Zeppelin Interpreter menu --> Spark --> Edit --> Properties --> Master,

changing that parameter from yarn-client to local[*] fixed my problem.
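In property form, that change looks like this (local[*] runs Spark embedded on the Zeppelin host instead of submitting to YARN):

# Spark interpreter property (Interpreter menu -> Spark -> Edit)
master = local[*]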

neverwinter
  • I am trying to set this up on Windows 8 R2. I had already set up SPARK_HOME and am able to run spark-shell from the Windows cmd. Now I have installed Zeppelin 0.8.0, and when I run spark.version it throws a NullPointerException. Any clue how to set it up on Windows, and how to check whether it is pointing to SPARK_HOME? – BdEngineer Jan 22 '19 at 08:16