
I just installed Apache Zeppelin (built from the latest source in the git repo) and confirmed that it is up and running on port 10008. I created a new notebook with a single line of code:

val a = "Hello World!"

I ran this paragraph and saw the error below:

java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
    at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
    at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:139)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:137)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:257)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:104)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:197)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:304)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Any clue?

My backend is Spark 1.5, and I verified via the interpreter's web interface that Zeppelin points to the right version of Spark and the appropriate spark.home.

sag
Bala

11 Answers


The error may also be caused by a failure that occurs while Zeppelin is trying to create the interpreter.

Zeppelin starts the interpreter in a separate process and tries to connect to it using the Thrift protocol.

In my case, I got this error when trying to assign 5 GB to the Spark driver in spark-defaults.conf. It was resolved by commenting out this line (or assigning 4g or less):

#spark.driver.memory              5g

You could have a look at this JIRA: ZEPPELIN-305.

EDIT:

This error can be caused by anything that prevents the Spark interpreter process from starting. Recently, I got it after adding JMX options to ZEPPELIN_JAVA_OPTS, which caused the interpreter process to use the same JMX port as the Zeppelin process, giving a "Port already in use" error.

Please check the Zeppelin logs (by default they are in ZEPPELIN_DIR/logs/) to see what is happening when Zeppelin tries to start the Spark interpreter.
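For example, a quick way to look at the most recent log output (assuming ZEPPELIN_DIR stands for your Zeppelin installation directory; the exact file names depend on your installation and configured interpreters):

ls -lt "$ZEPPELIN_DIR"/logs/             # newest log files first
tail -n 100 "$ZEPPELIN_DIR"/logs/*.log   # look for the interpreter startup failure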

user1314742

I had this issue when $SPARK_HOME was not set correctly.
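For reference, a minimal sketch of setting it for Zeppelin (the path below is only a placeholder; point it at your actual Spark installation):

# In $ZEPPELIN_HOME/conf/zeppelin-env.sh
export SPARK_HOME=/opt/spark   # placeholder path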

LK__

An error stack like [1] below could mean a lot of different things: the Zeppelin server could not connect to a local interpreter because it did not start or it died. It seems like a Zeppelin bug, as it can't detect when interpreter.sh exits without creating a Zeppelin interpreter process; I submitted https://issues.apache.org/jira/browse/ZEPPELIN-1984 to track that.

In all our cases, each with a different root cause, the real error was only revealed by adding

LOG="/tmp/interpreter.sh-$$.log"
date >> $LOG
set -x
exec >> $LOG
exec 2>&1

to $ZEPPELIN_HOME/bin/interpreter.sh; a /tmp/interpreter.sh-*.log file will then show you the actual problem.
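After re-running a paragraph, the newest of these trace files is the one to read; a minimal example (the file pattern comes from the snippet above):

ls -t /tmp/interpreter.sh-*.log | head -1             # newest trace file
less "$(ls -t /tmp/interpreter.sh-*.log | head -1)"   # inspect the failing command and its output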

[1]

ERROR [2017-01-18 16:54:38,533] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:1645) - Error
org.apache.zeppelin.interpreter.InterpreterException: org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:232)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:400)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:316)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)

Edit: another way to reveal the true root cause is to change log4j to see the output of the Spark interpreter process, as hinted by Jeff in ZEPPELIN-1984. Change your ZEPPELIN_HOME/conf/log4j.properties as follows:

log4j.rootLogger = INFO, dailyfile

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.appender.dailyfile.DatePattern=.yyyy-MM-dd
log4j.appender.dailyfile.Threshold = DEBUG
log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
log4j.appender.dailyfile.File = ${zeppelin.log.file}
log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.logger.org.apache.zeppelin.interpreter.InterpreterFactory=DEBUG
log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
log4j.logger.org.apache.zeppelin.scheduler=DEBUG
log4j.logger.org.apache.zeppelin.livy=DEBUG
log4j.logger.org.apache.zeppelin.flink=DEBUG
log4j.logger.org.apache.zeppelin.spark=DEBUG
log4j.logger.org.apache.zeppelin.python=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.util=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG

and restart Zeppelin. Note: it may produce excessive logging. My original advice of adding a few lines to interpreter.sh doesn't require restarting Zeppelin.
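A restart sketch using the standard daemon script (the path assumes a default installation layout):

$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart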

I also created a pull request to (partially) fix this issue: https://github.com/apache/zeppelin/pull/1921

Update 1/24/2017: https://issues.apache.org/jira/browse/ZEPPELIN-1984 is fixed in master and will be included in the Zeppelin 0.8 release. Two important fixes are part of ZEPPELIN-1984:

  • you won't get "connection refused" when an interpreter process can't start;
  • Zeppelin will show the root cause in the paragraph output.
Tagar

Problem

Zeppelin runs a custom Spark application on localhost; sometimes (if you have multiple networks, such as a VPN) it cannot use 127.0.0.1.

This is because of this source code: https://github.com/apache/zeppelin/blob/v0.8.1/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/RemoteInterpreterUtils.java#L104

public static String findAvailableHostAddress() throws UnknownHostException, SocketException {
    InetAddress address = InetAddress.getLocalHost();
    if (address.isLoopbackAddress()) {
      // If the local host resolves to a loopback address, return the first IPv4
      // address of any non-loopback interface instead (e.g. a VPN adapter).
      for (NetworkInterface networkInterface : Collections
          .list(NetworkInterface.getNetworkInterfaces())) {
        if (!networkInterface.isLoopback()) {
          for (InterfaceAddress interfaceAddress : networkInterface.getInterfaceAddresses()) {
            InetAddress a = interfaceAddress.getAddress();
            if (a instanceof Inet4Address) {
              return a.getHostAddress();
            }
          }
        }
      }
    }
    // Otherwise fall back to whatever the local host resolves to.
    return address.getHostAddress();
}

You can see that the Spark interpreter is running and listening on a "weird" IP:

ps aux | grep spark
zep/bin/interpreter.sh -d zep/interpreter/spark -c 10.100.37.2 -p 50778 -r : -l /zep/local-repo/spark -g spark

But the Zeppelin UI tries to connect to localhost, which resolves to 127.0.0.1, hence the Connection refused.

Solution

  • Disconnect from the VPN before running the Spark interpreter
  • Use v0.8.2, which fixes this via a new environment variable, ZEPPELIN_LOCAL_IP (see the sketch after this list)
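A minimal sketch of that workaround, assuming Zeppelin 0.8.2+ with a standard zeppelin-env.sh, and that binding to the loopback address is what you want:

# In $ZEPPELIN_HOME/conf/zeppelin-env.sh
export ZEPPELIN_LOCAL_IP=127.0.0.1   # force Zeppelin and its interpreters onto the loopback address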
Thomas Decaux

I noticed that the URL that points to Spark was not correct. Once I corrected it, it worked fine. Thanks anyway.
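For context, this usually means the master property in the Spark interpreter settings; a hedged sketch with a placeholder host (7077 is the standalone default port; use yarn-client or local[*] if that matches your setup):

# Spark interpreter property (Interpreter menu -> Spark -> edit)
master = spark://your-spark-master:7077   # placeholder host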

Bala

I had the same issue when $YARN_QUEUE was set incorrectly.
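A hedged sketch of what to check (the queue name below is only a placeholder; it must match a queue that actually exists on your cluster):

# Wherever the variable is exported for Zeppelin in this setup, e.g. zeppelin-env.sh
export YARN_QUEUE=default   # placeholder queue name
# The corresponding Spark setting is the spark.yarn.queue property.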

thepolina

This question has been open for a year now, and I'm not sure whether the solution to the problem was found. Recently, I bumped into a similar error using YARN Spark on Amazon EMR. As I debugged it, I realized the following, and would suggest people try this if they find themselves in similar shoes (the solution is based on EMR, but should be similar on other offerings):

1. kill -9 `ps -ef | grep zeppelin | grep -v grep | awk '{print $2}'` (makes sure zombie processes are taken care of)
2. kill -9 `ps -ef | grep hadoop-yarn-resourcemanager | grep -v grep | awk '{print $2}'`
3. sudo /sbin/restart hadoop-yarn-resourcemanager
4. At times, simply starting the resource manager does not start the name node: `sudo start hadoop-hdfs-namenode`
5. sudo /usr/lib/zeppelin/bin/zeppelin-daemon.sh start
6. Use telnet to make sure that the default ports are open for the required services (see the example below).
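For example (8080 is Zeppelin's default web port; adjust the host and port for the service you are checking):

telnet localhost 8080     # or: nc -zv localhost 8080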

At the end of all this, one should be able to get Zeppelin running properly with a valid SparkContext. Hope this was useful.

Pramit
  • This way you will run Zeppelin under the root user, which is a huge security issue. Imagine your user using %sh and then running arbitrary commands as the root user. – Alan Kis Jun 13 '18 at 12:46

In my case, (project-root)/node_modules/zeppelin/spark-2.0.2-bin-hadoop2.7 was not installed, for some unknown reason. rm -rf node_modules; npm cache clear; npm i fixed it.

christopherbalz

I fixed this error by changing the Spark master from yarn-cluster to yarn-client, as set in zepplin/conf/defalt.sh.

Jary zhen

I got exactly the same error when I tried to run Zeppelin with Spark in the same Docker container on a micro instance in Amazon ECS.

The source of the error was visible in the output log in %ZEPPELIN_HOME%/logs/*.out, which said that Zeppelin failed to start the Spark interpreter due to low memory. So I moved my Docker image to an instance with more memory.

Igor Bljahhin

In my case, I have three nodes in my cluster. Although Spark was installed on all three of them, Zeppelin was installed on only one.

So, in the Zeppelin Interpreter menu --> Spark --> Edit --> Properties --> Master,

changing that parameter from yarn-client to local[*] fixed my problem.
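In property form, that change looks like this (local[*] runs Spark embedded on the Zeppelin host instead of submitting to YARN):

# Spark interpreter property (Interpreter menu -> Spark -> Edit)
master = local[*]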

neverwinter
  • I am trying to set this up on Windows 8 R2. I had already set up SPARK_HOME and am able to run spark-shell from the Windows cmd. Now I have installed Zeppelin 0.8.0, and when I run spark.version it throws a NullPointerException. Any clue how to set it up on Windows, and how to check whether it is pointing to SPARK_HOME? – BdEngineer Jan 22 '19 at 08:16