I installed Zeppelin on Windows by following this tutorial and this one. I also installed Java 8 to avoid problems.

I'm now able to start the Zeppelin server, and I'm trying to run this code:

%pyspark
a=5*4
print("value = %i" % (a))
sc.version

I'm getting the error below, which is related to py4j. I had other problems with this library before (same as here), and to avoid them I replaced the py4j library in both Zeppelin and Spark on my computer with the latest version, py4j 0.10.7.

This is the error I get:

Traceback (most recent call last):
  File "C:\Users\SHIRM~1.ARG\AppData\Local\Temp\zeppelin_pyspark-1240802621138907911.py", line 309, in <module>
    sc = _zsc_ = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
  File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 118, in __init__
    conf, jsc, profiler_cls)
  File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 189, in _do_init
    self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port, auth_token)
  File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__
  File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\protocol.py", line 332, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:

I googled it, but couldn't find anyone else who has run into this.

Does anyone have an idea how I can solve this?

Thanks

Shir

2 Answers

I suspect you have Java 9 or 10 installed. Uninstall whichever of those versions you have and install a fresh copy of Java 8 from here: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Then set JAVA_HOME inside hadoop_env.cmd (you can open it with any text editor).

Note: Java 7 and 8 are the stable versions to use here, so uninstall any other Java versions. Make sure JAVA_HOME points to a JDK, not a JRE.
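
To check the last point quickly, here is a minimal Python sketch. It relies on the fact that a JDK ships the `javac` compiler in its `bin` folder while a JRE does not. The function name `looks_like_jdk` is made up for this illustration and is not part of Zeppelin or Spark.

```python
import os
from pathlib import Path


def looks_like_jdk(java_home):
    """Return True if java_home/bin contains javac, which a JRE does not ship."""
    bin_dir = Path(java_home) / "bin"
    # On Windows the compiler is javac.exe; on other systems it is plain javac.
    return any((bin_dir / name).is_file() for name in ("javac", "javac.exe"))


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    if not java_home:
        print("JAVA_HOME is not set")
    elif looks_like_jdk(java_home):
        print("JAVA_HOME looks like a JDK:", java_home)
    else:
        print("JAVA_HOME has no javac, so it is probably a JRE:", java_home)
```

If this reports a JRE, point JAVA_HOME at the JDK folder instead and restart Zeppelin so it picks up the new value.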

pvy4917
  • I have already installed this exact version, and changed the JAVA_HOME environment to `C:\Program Files\Java\jre1.8.0_181`, but I also have Java 10 installed. Should it be removed? Why? – Shir Oct 04 '18 at 14:16
  • Why do you want to use multiple versions of Java? And you need to set the path to JDK not JRE. – pvy4917 Oct 04 '18 at 14:28
  • Uninstalled it and changed to JDK, and it's still not working :( Another thing- when starting the server, the log gets stuck for a really long time on `Server.java[doStart]:327) - jetty-9.2.15.v20160210` . Maybe it's somehow related?... Any other ideas if not? Thanks – Shir Oct 04 '18 at 14:44
  • Why don't you setup Hortonworks sandbox? It has Zeppelin too. https://hortonworks.com/products/sandbox/ – pvy4917 Oct 04 '18 at 14:56
  • What version of Zeppelin do you have? It looks like 0.6.2 to me. – pvy4917 Oct 04 '18 at 14:57
  • Can you try using 0.7.0? – pvy4917 Oct 04 '18 at 16:37
  • The easiest solution is to use a cloud distribution such as Hortonworks. – pvy4917 Oct 04 '18 at 17:26
  • Hey, can you just do print(sc) – pvy4917 Oct 04 '18 at 17:29
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/181300/discussion-between-shir-and-prazy). – Shir Oct 04 '18 at 17:33

I faced the same problem today, and I fixed it by adding a PYTHONPATH variable to the system environment variables, like this:
%SPARK_HOME%\python\lib\py4j;%SPARK_HOME%\python\lib\pyspark
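
If you are not sure what exactly sits under `python\lib` in your Spark folder, a small Python sketch like the one below can build the value for you. It assumes the usual layout of a Spark binary download, where `python\lib` holds the py4j and pyspark `.zip` archives; your install may differ, so check the printed paths before using them. The function name `build_pythonpath` is made up for this illustration.

```python
import os
from pathlib import Path


def build_pythonpath(spark_home):
    """Join <spark_home>/python and every .zip in <spark_home>/python/lib.

    Assumption: the py4j and pyspark archives live in python/lib,
    as in a standard Spark binary distribution.
    """
    python_dir = Path(spark_home) / "python"
    entries = [str(python_dir)]
    entries += sorted(str(p) for p in (python_dir / "lib").glob("*.zip"))
    # os.pathsep is ';' on Windows and ':' elsewhere.
    return os.pathsep.join(entries)


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME")
    if spark_home:
        print(build_pythonpath(spark_home))
    else:
        print("SPARK_HOME is not set")
```

The printed string can be pasted as the PYTHONPATH value in the Windows environment variables dialog; restart the Zeppelin server afterwards.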

Clock Slave
Littlefish