I am trying to add this dependency to the Spark 2 interpreter in Zeppelin:

https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11/2.2.0

However, after adding the dependency, I get a null pointer exception when running any code.

[Screenshot: Null Pointer Exception]

[Screenshot: Spark & Scala Version]

[Screenshot: Adding Dependency]

  • [Please do not post images of text in a question](https://meta.stackoverflow.com/questions/303812/discourage-screenshots-of-code-and-or-errors), as these are not searchable. Paste the text itself. Then people can find it to give you an answer. – Brian Tompsett - 汤莱恩 Oct 06 '18 at 17:00

3 Answers


You don't need to add spark-sql; it is already included in the Spark interpreter.

zjffdu
  • I'm trying to run this command: data = fire_services_customDF.withColumn("CallDateTmp", date_format(to_date(col("CallDate"), "MM/dd/yy"), "yyyy-MM-dd")).cast("timestamp") and I get this error: :37: error: not found: value date_format, :37: error: not found: value to_date – Mustafa Akmal Oct 08 '18 at 06:58
  • date_format is a Spark SQL function. You need to import it explicitly. This error is not due to a missing spark-sql jar. – zjffdu Oct 08 '18 at 11:54
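As the comment says, date_format, to_date, and col live in org.apache.spark.sql.functions and must be imported before use. A minimal sketch of the explicit import, assuming fire_services_customDF is a DataFrame already defined in the notebook:

```scala
// Bring Spark SQL's column functions into scope; without this line
// the compiler reports "not found: value date_format" etc.
import org.apache.spark.sql.functions.{col, date_format, to_date}

val data = fire_services_customDF.withColumn(
  "CallDateTmp",
  date_format(to_date(col("CallDate"), "MM/dd/yy"), "yyyy-MM-dd").cast("timestamp")
)
```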

Just add %spark.sql at the top of your paragraph to provide an SQL environment

https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html#overview
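For illustration, a Zeppelin paragraph using the SQL interpreter would look something like this (the table name here is hypothetical):

```sql
%spark.sql
select CallDate, CallDateTmp from fire_services limit 10
```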

bp2010

I solved the problem. I was defining a class in Scala, and the to_date and date_format methods were being used inside the class, but my import statements were outside it. All I had to do was move the import statements inside the class body, and it worked fine.

case class HelperClass() {
  // Imports must sit inside the class body so that
  // date_format, to_date and col resolve in the methods below
  import org.apache.spark.sql.functions._
  import org.apache.spark.sql.types._

  var fire_services_customDF = fire_servicesDF
  var data = fire_servicesDF

  def SetDatatypes(): Unit = {
    data = fire_services_customDF.withColumn("CallDateTmp",
      date_format(to_date(col("CallDate"), "MM/dd/yy"), "yyyy-MM-dd").cast("timestamp"))
  }

  def PrintSchema(): Unit = {
    data.printSchema
  }
}
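For reference, a hypothetical usage of the class above, assuming fire_servicesDF is a DataFrame already loaded in the notebook:

```scala
// Instantiate the helper, convert CallDate, then inspect the schema;
// CallDateTmp should now appear with timestamp type
val helper = HelperClass()
helper.SetDatatypes()
helper.PrintSchema()
```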