
I'm using Spark 2.2.1 and CarbonData 1.5.3. Following the instructions in the official CarbonData guide, I can run the import statements:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._

But the next step fails with the log below:

scala> val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<carbon_store_path>")
19/05/08 12:14:17 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
19/05/08 12:14:17 WARN CarbonProperties: The enable unsafe sort value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The enable off heap sort value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The custom block distribution value "null" is invalid. Using the default value "false
19/05/08 12:14:17 WARN CarbonProperties: The enable vector reader value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The carbon task distribution value "null" is invalid. Using the default value "block
19/05/08 12:14:17 WARN CarbonProperties: The enable auto handoff value "null" is invalid. Using the default value "true
19/05/08 12:14:17 WARN CarbonProperties: The specified value for property 512is invalid.
19/05/08 12:14:17 WARN CarbonProperties: The specified value for property carbon.sort.storage.inmemory.size.inmbis invalid. Taking the default value.512
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.CarbonSessionStateBuilder
  at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
  at org.apache.spark.util.CarbonReflectionUtils$.createObject(CarbonReflectionUtils.scala:324)
  at org.apache.spark.util.CarbonReflectionUtils$.getSessionState(CarbonReflectionUtils.scala:220)
  at org.apache.spark.sql.CarbonSession.sessionState$lzycompute(CarbonSession.scala:57)
  at org.apache.spark.sql.CarbonSession.sessionState(CarbonSession.scala:56)
  at org.apache.spark.sql.CarbonSession$CarbonBuilder$$anonfun$getOrCreateCarbonSession$2.apply(CarbonSession.scala:260)
  at org.apache.spark.sql.CarbonSession$CarbonBuilder$$anonfun$getOrCreateCarbonSession$2.apply(CarbonSession.scala:260)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
  at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
  at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
  at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
  at org.apache.spark.sql.CarbonSession$CarbonBuilder.getOrCreateCarbonSession(CarbonSession.scala:260)
  at org.apache.spark.sql.CarbonSession$CarbonBuilder.getOrCreateCarbonSession(CarbonSession.scala:169)
  ... 50 elided

My JAVA_HOME is set (otherwise Spark wouldn't run at all), and I can run Spark applications without CarbonData; it only fails when trying to create the CarbonSession. SPARK_HOME points to Spark's bin directory.

I'm running Spark on local machine and using local filesystem for storage without hive.

Any help is appreciated. Please let me know if any other details are needed.


1 Answer


Make sure that your CarbonData jar matches your Spark version. I made that mistake myself at first.
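CarbonData release jars are built against a specific Spark version, and the jar file name encodes which one (e.g. ...-bin-spark2.2.1-...). A ClassNotFoundException for org.apache.spark.sql.hive.CarbonSessionStateBuilder is the typical symptom of a build mismatch, since CarbonData picks that class by reflection based on the running Spark version. A minimal sketch of how I'd launch the shell, assuming a hypothetical download path and that a Spark-2.2.1 build of the 1.5.3 release is available (check the actual release artifacts for the exact file name):

```shell
# Hypothetical path: adjust to wherever you downloaded the jar.
# The key point is that the sparkX.Y.Z part of the jar name must
# match the Spark version you are running (here 2.2.1):
./bin/spark-shell \
  --jars /path/to/apache-carbondata-1.5.3-bin-spark2.2.1-hadoop2.7.2.jar
```

If the jar was built for a different Spark line (say spark2.1.0 or spark2.3.2), the reflection lookup fails exactly as in the stack trace above.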
