I am new to Scala and Spark. I am trying to read a CSV file, so I create a SparkSession to read it. I also create a SparkContext to work with RDDs later. I am using Scala IDE.
The error that appears is probably a common Java error, but I am not able to solve it.
Code:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql._
object Solution1 {
  def main(args: Array[String]): Unit = {
    println("Create context for rdd")
    val conf = new SparkConf().setAppName("Problem1")
    val cont = new SparkContext(conf)

    println("create SparkSession and read csv")
    val spark = SparkSession.builder().appName("Problem1").getOrCreate()
    val data = spark.read.option("header", false).csv("file.csv")
    // further processing
    cont.stop()
  }
}
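For reference, the same setup can also be written with a single SparkSession, since the session already carries a SparkContext (this is only a sketch of what I mean; "file.csv" is a placeholder path and no further RDD processing is shown):

```scala
import org.apache.spark.sql.SparkSession

object Solution1Alt {
  def main(args: Array[String]): Unit = {
    // One SparkSession for both the DataFrame and RDD APIs
    val spark = SparkSession.builder().appName("Problem1").getOrCreate()

    // The underlying SparkContext is available from the session,
    // so no separate `new SparkContext(conf)` is needed
    val sc = spark.sparkContext

    val data = spark.read.option("header", false).csv("file.csv")

    spark.stop()
  }
}
```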
The error:
Create context for rdd
Exception in thread "main" java.lang.NoClassDefFoundError: org/spark_project/guava/cache/CacheLoader
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:73)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:68)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:55)
at Solution1$.main(Solution1.scala:13)
at Solution1.main(Solution1.scala)
Caused by: java.lang.ClassNotFoundException: org.spark_project.guava.cache.CacheLoader
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more