Questions tagged [apache-zeppelin]

Apache Zeppelin is a web-based notebook for interactive, data-driven analytics. You can create interactive and collaborative documents with SQL, Python, Scala, and more. It also supports Markdown syntax.

Apache Zeppelin home page

1460 questions
6
votes
2 answers

Select columns that satisfy a condition

I'm running the following notebook in zeppelin: %spark.pyspark l = [('user1', 33, 1.0, 'chess'), ('user2', 34, 2.0, 'tenis'), ('user3', None, None, ''), ('user4', None, 4.0, ' '), ('user5', None, 5.0, 'ski')] df = spark.createDataFrame(l, ['name',…
6
votes
0 answers

SparkSession returns nothing with a HiveServer2 connection through JDBC

I have an issue reading data from a remote HiveServer2 using JDBC and SparkSession in Apache Zeppelin. Here is the code. %spark import org.apache.spark.sql.Row import org.apache.spark.sql.SparkSession val prop = new…
Thomas DUDOUX
  • 83
  • 1
  • 5
6
votes
1 answer

How to expose Spark Driver behind dockerized Apache Zeppelin?

I am currently building a custom docker container from a plain distribution with Apache Zeppelin + Spark 2.x inside. My Spark jobs will run in a remote cluster and I am using yarn-client as master. When I run a notebook and try to print sc.version,…
ThR37
  • 3,965
  • 6
  • 35
  • 42
6
votes
0 answers

pyspark: org.apache.thrift.transport.TTransportException at ERROR

I'm using Zeppelin Notebooks/Apache Spark and I am frequently getting the following error: org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at…
6
votes
0 answers

Zeppelin : How to fill dynamic form dropdown list from sql interpreter

I have connected my Zeppelin to PostgreSQL using the native driver provided in the Zeppelin package. I have a column, ID (of a table, say 'a'), which consists of all the IDs required for further processing. …
kumar_m_kiran
  • 3,982
  • 4
  • 47
  • 72
6
votes
1 answer

dep interpreter not found

I am trying to use the dep interpreter in Zeppelin. I use the %dep declaration within my Zeppelin notebook. However, I end up with the error "dep interpreter not found". The %dep interpreter is configured correctly in the interpreter section.
Stefan Papp
  • 2,199
  • 1
  • 28
  • 54
6
votes
2 answers

ClassNotFoundException: org.apache.spark.repl.SparkCommandLine

I am a newbie in Apache Zeppelin and I try to run it locally. I try to run just a simple sanity check to see that sc exists and get the error below. I compiled it for pyspark and spark 1.5 (I use spark 1.5). I increased the memory to 5 GB and…
Tom Ron
  • 5,906
  • 3
  • 22
  • 38
6
votes
4 answers

Zeppelin throws java.lang.OutOfMemoryError: Java heap space

I am trying to use Zeppelin with the following code: val dataText = sc.parallelize(IOUtils.toString(new URL("http://XXX.XX.XXX.121:8090/my_data.txt"),Charset.forName("utf8")).split("\n")) case class Data(id: string, time: long, value1: Double,…
Kiran
  • 2,997
  • 6
  • 31
  • 62
6
votes
1 answer

How to set up Zeppelin to work with remote EMR Yarn cluster

I have an Amazon EMR Hadoop v2.6 cluster with Spark 1.4.1 and the Yarn resource manager. I want to deploy Zeppelin on a separate machine so that the EMR cluster can be turned off when there are no jobs running. I tried following the instructions from here…
snowindy
  • 3,117
  • 9
  • 40
  • 54
6
votes
1 answer

How to specify a missing value in a dataframe

I am trying to load a CSV file into a Spark data frame with spark-csv [1] using an Apache Zeppelin notebook. When loading a numeric field that doesn't have a value, the parser fails for that line and the line gets skipped. I would have expected the…
Samuel Kerrien
  • 6,965
  • 2
  • 29
  • 32
6
votes
8 answers

Apache Zeppelin process died

I'm trying to run Zeppelin on Ubuntu 14 with Hadoop 1.0.3 and Spark 1.4.0. I've finished building the source code, and all of the packages built successfully. But when I run the daemon, it fails and says that the Zeppelin process had…
Joseph Seung Jae Dollar
  • 1,016
  • 4
  • 13
  • 28
6
votes
1 answer

How to install Apache Zeppelin on existing Apache Spark standalone cluster

I have an existing Apache Spark (version 1.3) standalone cluster on AWS and I would like to install Apache Zeppelin. I have a very simple question: do I have to install Zeppelin on the Spark master? If the answer is yes, could I use that guide…
5
votes
1 answer

Cannot find conda info. Please verify your conda installation on EMR

I am trying to install conda on EMR, and below is my bootstrap script. It looks like conda is getting installed, but it is not getting added to the environment variable. When I manually update the $PATH variable on the EMR master node, it can identify conda.…
Explorer
  • 1,491
  • 4
  • 26
  • 67
5
votes
1 answer

Zeppelin - LDAP Authentication failed

I am trying to configure LDAP authentication in a Zeppelin notebook. I have specified the LDAP server and other configurations by following this link. However, when I try to log in I get the following error: ERROR [2019-12-23 17:52:12,196] ({qtp1580893732-66}…
user1584253
  • 975
  • 2
  • 18
  • 55
5
votes
5 answers

Error when running Zeppelin connected to AWS Glue

I am following the tutorial steps shown in https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-local-notebook.html There's no issue with the connection between local Zeppelin and AWS Glue. However, when I run a test command on Zeppelin it gives me…