
In Bluemix Spark I want to use a HiveContext:

from pyspark.sql import HiveContext

HqlContext = HiveContext(sc)
# some code
df = HqlContext.read.parquet("swift://notebook.spark/file.parquet")

I get the following error:

Py4JJavaError: An error occurred while calling o45.parquet. : java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

YAKOVM

1 Answer


The HiveContext is not included by default in the Bluemix Spark offering.

To include it in your notebook, you should be able to use %AddJar to load it from a publicly accessible server, e.g.:

%AddJar http://my.server.com/jars/spark-hive_2.10-1.5.2.jar

You can also point this directly at Maven Central:

%AddJar http://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.10/1.5.2/spark-hive_2.10-1.5.2.jar
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)

This enables the HiveContext for me.

Now, it's worth noting that the latest versions available on Maven probably don't match the version of Spark currently running on Bluemix, so my suggestion is to check the Spark version on Bluemix with:

sc.version

Then match the version of this JAR to that version of Spark.
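As a sketch of that version-matching step (the helper function name is my own, not part of any Spark API), the mapping from the string returned by `sc.version` to the Maven Central download URL looks like:

```python
# Hypothetical helper: build the Maven Central URL for the spark-hive JAR
# matching a given Spark version string (e.g. the value of sc.version).
def spark_hive_jar_url(spark_version, scala_version="2.10"):
    """Return the Maven Central URL for spark-hive at spark_version."""
    artifact = "spark-hive_{}".format(scala_version)
    return ("http://repo1.maven.org/maven2/org/apache/spark/"
            "{a}/{v}/{a}-{v}.jar").format(a=artifact, v=spark_version)

# For Spark 1.5.2 this reproduces the URL passed to %AddJar above.
print(spark_hive_jar_url("1.5.2"))
```

You would then pass the resulting URL to `%AddJar` in the notebook.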