
I am new to Scala and Spark. I am trying to read in a CSV file, so I create a SparkSession to read the CSV. I also create a SparkContext so I can work with RDDs later. I am using Scala IDE.

The error is probably a common Java error, but I have not been able to solve it.

Code:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql._



object Solution1 {
  def main(args: Array[String]) {

    println("Create contex for rdd ")
    val conf = new SparkConf().setAppName("Problem1")
    val cont = new SparkContext(conf)

    println("create SparkSession and read csv")
    val spark = SparkSession.builder().appName("Problem1").getOrCreate()
    val data = spark.read.option("header", false).csv("file.csv")


    // further processing

    cont.stop()
  }
}

The error:

Create contex for rdd 
Exception in thread "main" java.lang.NoClassDefFoundError: org/spark_project/guava/cache/CacheLoader
    at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:73)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:68)
    at org.apache.spark.SparkConf.<init>(SparkConf.scala:55)
    at Solution1$.main(Solution1.scala:13)
    at Solution1.main(Solution1.scala)
Caused by: java.lang.ClassNotFoundException: org.spark_project.guava.cache.CacheLoader
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more

1 Answer


Please create the SparkContext like below:

import org.apache.spark.{SparkConf, SparkContext}

def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("someName").setMaster("local[*]")
    val sparkContext = new SparkContext(conf)
}
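Note that setMaster("local[*]") runs Spark inside the local JVM using all available cores; when submitting to a cluster with spark-submit you would normally leave the master unset in code and pass it on the command line instead.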

To read the data as an RDD:

val rdd = sparkContext.textFile("path.csv")
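For example, a minimal sketch of splitting each line of that RDD into columns (assuming a plain comma delimiter; this naive split does not handle quoted fields):

// Split each line on commas; -1 keeps trailing empty columns.
val rows = rdd.map(line => line.split(",", -1))

// e.g. take the first column of every row and print a few values
val firstColumn = rows.map(cols => cols(0))
firstColumn.take(5).foreach(println)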

and the SparkSession like below:

import org.apache.spark.sql.SparkSession

def main(args: Array[String]): Unit = {
    val spark = SparkSession
                .builder()
                .appName("Creating spark session")
                .master("local[*]")
                .getOrCreate()
}

To read data as a DataFrame, call:

val df = spark.read.format("json").load("path.json")
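Since the question reads a CSV rather than JSON, the equivalent call would use the csv reader; a sketch (the header and inferSchema options here are assumptions, adjust them to your file):

val df = spark.read
    .option("header", "false")      // the question's file has no header row
    .option("inferSchema", "true")  // optionally infer column types
    .csv("file.csv")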

Also, if you have created a SparkSession, you do not need to create a SparkContext separately; you can reach the underlying SparkContext through the session and get the same functionality:

val data = spark.sparkContext.textFile("path")
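Putting this together with the question's code, a minimal sketch of the whole program using only a SparkSession (assuming local execution and the question's file.csv) could look like:

import org.apache.spark.sql.SparkSession

object Solution1 {
  def main(args: Array[String]): Unit = {
    // One entry point: the SparkSession carries the SparkContext.
    val spark = SparkSession.builder()
      .appName("Problem1")
      .master("local[*]")
      .getOrCreate()

    // DataFrame read of the CSV
    val data = spark.read.option("header", "false").csv("file.csv")

    // RDD work through the session's SparkContext
    val lines = spark.sparkContext.textFile("file.csv")

    // further processing

    spark.stop()
  }
}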