Apache Zeppelin is a web-based notebook for interactive, data-driven analytics. You can create interactive, collaborative documents with SQL, Python, Scala, and more. It also supports Markdown syntax.
Questions tagged [apache-zeppelin]
1460 questions
16 votes, 1 answer
Field "features" does not exist. SparkML
I am trying to build a model in Spark ML with Zeppelin.
I am new to this area and would like some help. I think I need to set the correct datatypes for the columns and set the first column as the label. Any help would be greatly appreciated, thank…

Young4844
15 votes, 2 answers
How to get the output from console streaming sink in Zeppelin?
I'm struggling to get the console sink working with PySpark Structured Streaming when run from Zeppelin. Basically, I'm not seeing any results printed to the screen, or to any logfiles I've found.
My question: Does anyone have a working example of…

m01
15 votes, 2 answers
Keyboard shortcuts for Zeppelin Notebook
There was an old Jira issue for keyboard shortcuts, but there did not appear to be an associated document:
https://issues.apache.org/jira/browse/ZEPPELIN-391
Is there a comprehensive cheat-sheet for the shortcuts? Especially to compare to the excellent…

WestCoastProjects
15 votes, 1 answer
Using d3.js with Apache Zeppelin
I'm trying to add more visualization options to Apache Zeppelin by integrating it with d3.js.
I found an example where someone did it with leaflet.js here, and tried to do something similar -- unfortunately I'm not too familiar with AngularJS (what…

user5004049
15 votes, 1 answer
Zeppelin: Scala Dataframe to python
If I have a Scala paragraph with a DataFrame, can I share it and use it from Python? (As I understand it, pyspark uses py4j.)
I tried this:
Scala paragraph:
x.printSchema
z.put("xtable", x )
Python paragraph:
%pyspark
import numpy as np
import…

oluies
15 votes, 11 answers
Hello world in zeppelin failed
I just installed Apache Zeppelin (built from the latest source from the git repo) and confirmed that it is up and running on port 10008.
I created a new notebook with a single line of code:
val a = "Hello World!"
I ran this paragraph and saw the…

Bala
14 votes, 2 answers
What is Apache Zeppelin?
We often hear about Apache Zeppelin, so a few questions come to mind:
What is Apache Zeppelin?
What new and/or extra does it add to the Big Data ecosystem?
Is it a replacement for some of the framework(s)/tool(s) already existing in Big…

Farooque
14 votes, 3 answers
Is it possible to customize the skin on Zeppelin?
Is it possible to customize the skin on Zeppelin? In other words, can I replace the Zeppelin logo with something else?

jjreddick
13 votes, 1 answer
Convert between spark.SQL DataFrame and pandas DataFrame
Is it possible to convert from a Spark SQL DataFrame to pd.DataFrame in the %pyspark environment?

Hello lad
13 votes, 1 answer
How can I get sql results over 100 in apache zeppelin?
When I execute this query in apache-zeppelin, I get only 100 results, with the message 'Results are limited by 100.'
%sql
SELECT ip
FROM log
So I appended 'Limit 10000' to the SQL query, but it returns only 100 results again.
%sql
SELECT ip
FROM log
LIMIT…

Justin Pyo
12 votes, 2 answers
How do I get independent service Zeppelin to see Hive?
I am using HDP-2.6.0.3 but I need Zeppelin 0.8, so I have installed it as an independent service. When I run:
%sql
show tables
I get nothing back and I get 'table not found' when I run Spark2 SQL commands. Tables can be seen in the 0.7 Zeppelin…

schoon
11 votes, 1 answer
Cannot read csv file Apache Zeppelin 0.8
I am currently using Apache Zeppelin 0.8. I tried loading a CSV file like this:
val df = spark.read.option("header", "true").option("inferSchema", "true").csv("/path/to/csv/name.csv")
I have also tried this:
val df =…

Skeftical
11 votes, 6 answers
How can I get Zeppelin to restart cleanly on an EMR cluster?
I am running an EMR cluster and trying to use a Zeppelin notebook for data analysis.
Versions:
Release label: emr-5.2.1
Hadoop distribution: Amazon 2.7.3
Hive 2.1.0
Spark 2.0.2
Zeppelin 0.6.2
I am consistently having problems with Zeppelin hanging…

Andy Jobe
11 votes, 2 answers
Scala and Spark UDF function
I made a simple UDF to convert or extract some values from a time field in a temp table in Spark. I registered the function, but when I call it using SQL it throws a NullPointerException. Below is my function and the process of executing it. I am…

fanbondi
11 votes, 6 answers
Reading csv files in zeppelin using spark-csv
I want to read CSV files in Zeppelin and would like to use Databricks' spark-csv package: https://github.com/databricks/spark-csv
In the spark-shell, I can use spark-csv with:
spark-shell --packages com.databricks:spark-csv_2.11:1.2.0
But how do I…

fabsta