
Hi, I am trying to set up a small Spark application with sbt.

My build.sbt is

import Dependencies._

name := "hello"

version := "1.0"

scalaVersion := "2.11.8"

val sparkVersion = "1.6.1"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-twitter" % sparkVersion
)

libraryDependencies += scalaTest % Test
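
(For reference, the import Dependencies._ line refers to project/Dependencies.scala from the standard sbt seed template, which looks roughly like this; the exact ScalaTest version here is an assumption and may differ in your project:)

import sbt._

object Dependencies {
  // Shared dependency definitions used by build.sbt
  lazy val scalaTest = "org.scalatest" %% "scalatest" % "3.0.1"
}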

Everything works fine and sbt resolves all the dependencies, but when I try to use spark in my hello.scala project file I get this error: not found: value spark

My hello.scala file is:

package example
import org.apache.spark._
import org.apache.spark.SparkContext._

object Hello extends fileImport with App {
  println(greeting)
  anime.select("*").orderBy($"rating".desc).limit(10).show()
}

trait fileImport {
  lazy val greeting: String = "hello"
  var anime = spark.read.option("header", true).csv("C:/anime.csv")
  var ratings = spark.read.option("header", true).csv("C:/rating.csv")
}

Here is the error I get:

[info] Compiling 1 Scala source to C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\target\scala-2.11\classes...
[error] C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\src\main\scala\example\Hello.scala:12: not found: value spark
[error]   var anime = spark.read.option("header", true).csv("C:/anime.csv")
[error]               ^
[error] C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\src\main\scala\example\Hello.scala:13: not found: value spark
[error]   var ratings = spark.read.option("header", true).csv("C:/rating.csv")
[error]                 ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed Sep 10, 2017 1:44:47 PM
Hassan Aftab

1 Answer


The spark variable is pre-initialized in spark-shell only.

In your own application code you need to initialize the spark variable yourself:

import org.apache.spark.sql.SparkSession

// Build the session yourself; spark-shell does this for you automatically.
val spark = SparkSession.builder().appName("testings").master("local").getOrCreate()

You can change the "testings" name to whatever you like. The .master option is optional if you run the code with spark-submit, since the master is supplied on the command line there.
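
Putting this into the question's file, a minimal sketch of how Hello.scala could look (this assumes the spark-sql 2.x artifact is on the classpath; the file paths are taken from the question):

package example

import org.apache.spark.sql.SparkSession

object Hello extends App {
  // In spark-shell this session is pre-built as `spark`; in an application
  // you create it yourself.
  val spark = SparkSession.builder()
    .appName("testings")
    .master("local")
    .getOrCreate()

  // Needed for the $"column" syntax used below.
  import spark.implicits._

  val anime = spark.read.option("header", true).csv("C:/anime.csv")
  val ratings = spark.read.option("header", true).csv("C:/rating.csv")

  anime.select("*").orderBy($"rating".desc).limit(10).show()

  spark.stop()
}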

Ramesh Maharjan
  • Thanks Ramesh. One more thing to add here: SparkSession only works with Spark 2.0+, so my build.sbt is updated now: `import Dependencies._ name := "hello" version := "1.0" scalaVersion := "2.11.8" val sparkVersion = "2.0.0" libraryDependencies ++= Seq( "org.apache.spark" %% "spark-core" % sparkVersion, "org.apache.spark" %% "spark-streaming" % sparkVersion, "org.apache.spark" %% "spark-sql" % sparkVersion ) libraryDependencies += scalaTest % Test` [https://stackoverflow.com/questions/37337461/what-is-version-library-spark-supported-sparksession] – Hassan Aftab Sep 10 '17 at 10:13
  • Yes, that's right. For older versions you need to create an SQLContext yourself (see the sketch after these comments). – Ramesh Maharjan Sep 10 '17 at 10:15
  • Ramesh, there is one more thing I am stuck on right now, and online help hasn't gotten me past it: **value $ is not a member of StringContext** – Hassan Aftab Sep 10 '17 at 10:33
  • `import org.apache.spark.{SparkConf, SparkContext} import org.apache.spark.sql.{SparkSession, SQLContext} import org.apache.spark.sql._ object Hello { val spark = SparkSession.builder().master("local").appName("TV Series Analysis").getOrCreate() import spark.sqlContext.implicits._ import spark.implicits._ var anime = spark.read.option("header", true).csv("C:/anime.csv") anime.select("*").orderBy($"rating".desc).limit(10).show() }` – Hassan Aftab Sep 10 '17 at 10:36
  • 1
    thanks i am done here is new code file `val spark = SparkSession.builder().master("local").appName("TV Series Analysis").config("spark.sql.warehouse.dir","file:///c:/tmp/spark-warehouse").getOrCreate() import spark.implicits._ var anime = spark.read.option("header", true).csv("C:/anime.csv") var ratings = spark.read.option("header", true).csv("C:/rating.csv") anime.select("*").orderBy($"rating".desc).limit(10).show()` – Hassan Aftab Sep 10 '17 at 11:12
  • Glad to hear that you sorted out the problem. Yes, $ is recognized once you import sqlContext.implicits._ (or spark.implicits._ on 2.x). If the answer helped then please accept and upvote :) – Ramesh Maharjan Sep 10 '17 at 14:11
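
For completeness, the older-version path mentioned in the comments: Spark 1.x has no SparkSession, so you build an SQLContext yourself. A minimal sketch, assuming Spark 1.6.x plus the external com.databricks:spark-csv package (which 1.x needs for CSV reading); the object name Hello16 is just for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object Hello16 extends App {
  val conf = new SparkConf().setAppName("testings").setMaster("local")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  // Enables the $"column" syntax on 1.x, just like spark.implicits._ on 2.x.
  import sqlContext.implicits._

  // On 1.x, CSV support comes from the external spark-csv package.
  val anime = sqlContext.read
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .load("C:/anime.csv")

  anime.select("*").orderBy($"rating".desc).limit(10).show()

  sc.stop()
}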