Apache Zeppelin is a web-based notebook for interactive, data-driven analytics. You can create interactive and collaborative documents with SQL, Python, Scala, and more. It also supports Markdown syntax.
Questions tagged [apache-zeppelin]
1460 questions
6
votes
2 answers
Select columns that satisfy a condition
I'm running the following notebook in zeppelin:
%spark.pyspark
l = [('user1', 33, 1.0, 'chess'), ('user2', 34, 2.0, 'tenis'), ('user3', None, None, ''), ('user4', None, 4.0, ' '), ('user5', None, 5.0, 'ski')]
df = spark.createDataFrame(l, ['name',…

Sofiane Cherchalli
- 123
- 2
- 8
6
votes
0 answers
SparkSession returns nothing with a HiveServer2 connection through JDBC
I have an issue reading data from a remote HiveServer2 using JDBC and SparkSession in Apache Zeppelin.
Here is the code.
%spark
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
val prop = new…

Thomas DUDOUX
- 83
- 1
- 5
6
votes
1 answer
How to expose Spark Driver behind dockerized Apache Zeppelin?
I am currently building a custom docker container from a plain distribution with Apache Zeppelin + Spark 2.x inside.
My Spark jobs will run in a remote cluster and I am using yarn-client as master.
When I run a notebook and try to print sc.version,…

ThR37
- 3,965
- 6
- 35
- 42
6
votes
0 answers
pyspark: org.apache.thrift.transport.TTransportException at ERROR
I'm using Zeppelin Notebooks/Apache Spark and I am frequently getting the following error:
org.apache.thrift.transport.TTransportException
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at…

Derek Jedamski
- 195
- 1
- 9
6
votes
0 answers
Zeppelin : How to fill dynamic form dropdown list from sql interpreter
I have connected my Zeppelin to PostgreSQL using the native driver provided in the Zeppelin package. I have a column, ID (of a table, say 'a'), which consists of all the IDs required for further processing. …

kumar_m_kiran
- 3,982
- 4
- 47
- 72
6
votes
1 answer
dep interpreter not found
I try to use the dep interpreter in Zeppelin.
I use the %dep declaration within my zeppelin notebook.
However, I end up with the error "dep interpreter not found".
The %dep interpreter is configured correctly in the interpreter section.

Stefan Papp
- 2,199
- 1
- 28
- 54
6
votes
2 answers
ClassNotFoundException: org.apache.spark.repl.SparkCommandLine
I am new to Apache Zeppelin and am trying to run it locally. I ran a simple sanity check to see that sc exists and got the error below.
I compiled it for pyspark and spark 1.5 (I use spark 1.5). I increased the memory to 5 GB and…

Tom Ron
- 5,906
- 3
- 22
- 38
6
votes
4 answers
Zeppelin throws java.lang.OutOfMemoryError: Java heap space
I am trying to use Zeppelin with the following code:
val dataText = sc.parallelize(IOUtils.toString(new URL("http://XXX.XX.XXX.121:8090/my_data.txt"),Charset.forName("utf8")).split("\n"))
case class Data(id: String, time: Long, value1: Double,…

Kiran
- 2,997
- 6
- 31
- 62
6
votes
1 answer
How to set up Zeppelin to work with remote EMR Yarn cluster
I have an Amazon EMR Hadoop v2.6 cluster with Spark 1.4.1 and the YARN resource manager.
I want to deploy Zeppelin on a separate machine so that the EMR cluster can be turned off when there are no jobs running.
I tried following the instructions from here…

snowindy
- 3,117
- 9
- 40
- 54
6
votes
1 answer
How to specify a missing value in a dataframe
I am trying to load a CSV file into a Spark data frame with spark-csv [1] using an Apache Zeppelin notebook. When a numeric field has no value, the parser fails for that line and the line gets skipped.
I would have expected the…

Samuel Kerrien
- 6,965
- 2
- 29
- 32
6
votes
8 answers
Apache Zeppelin process died
I'm trying to run Zeppelin on Ubuntu 14 with Hadoop 1.0.3 and Spark 1.4.0.
I've finished building the source code, and all of the packages built successfully. But when I run the daemon, it fails and says that the Zeppelin process had…

Joseph Seung Jae Dollar
- 1,016
- 4
- 13
- 28
6
votes
1 answer
How to install Apache Zeppelin on existing Apache Spark standalone cluster
I have an existing Apache Spark (version 1.3) standalone cluster on AWS and I would like to install Apache Zeppelin.
I have a very simple question: do I have to install Zeppelin on the Spark master?
If the answer is yes, could I use that guide…

PistolPete
- 147
- 2
- 10
5
votes
1 answer
Cannot find conda info. Please verify your conda installation on EMR
I am trying to install conda on EMR, and below is my bootstrap script. It looks like conda is getting installed, but it is not getting added to the environment variable. When I manually update the $PATH variable on the EMR master node, it can identify conda.…

Explorer
- 1,491
- 4
- 26
- 67
5
votes
1 answer
Zeppelin - LDAP Authentication failed
I am trying to configure LDAP authentication in a Zeppelin notebook. I have specified the LDAP server and other configurations by following this link. However, when I try to log in, I get the following error:
ERROR [2019-12-23 17:52:12,196] ({qtp1580893732-66}…

user1584253
- 975
- 2
- 18
- 55
5
votes
5 answers
Error when running Zeppelin connected to AWS Glue
I am following the tutorial steps shown in https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-local-notebook.html
There is no issue with the connection between local Zeppelin and AWS Glue. However, when I run a test command in Zeppelin, it gives me…

conandor
- 3,637
- 6
- 29
- 36