
In some Spark code, I have seen programmers use something like this to create a SparkSession:

 SparkSession session = SparkSession
      .builder()
      .appName("Spark Hive Example")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate();

But I have always used this kind of code to create a JavaSparkContext:

SparkConf sparkConf = new SparkConf().setAppName("Simple App").setMaster("local");
JavaSparkContext spark = new JavaSparkContext(sparkConf);

Starting from the latter code, is there any way I can get a Hive context to perform operations on Hive tables?

Thanks!

  • Also, with SparkSession I cannot use the parallelize() method. Is there an alternative? I don't seem to understand when to use SparkSession and when to use JavaSparkContext. The Java programming guide by Apache Spark uses both as needed. http://spark.apache.org/docs/latest/rdd-programming-guide.html – Vinay Limbare Oct 19 '17 at 18:40

2 Answers


You are using Spark 2.0 or later, which no longer relies on SQLContext or HiveContext; SparkSession with enableHiveSupport() is a sufficient replacement.

So all you have to do is use the session instance you already have.
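
For example, a minimal sketch of querying a Hive table through that session (the table name my_table is just a placeholder for illustration):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Run SQL against a Hive table via the existing SparkSession instance.
Dataset<Row> result = session.sql("SELECT * FROM my_table LIMIT 10");
result.show();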

  • Thanks! I am currently using Spark 2.2.0 and am new to it. Does JavaSparkContext have an advantage over SparkSession? Or is JavaSparkContext going to get deprecated sometime in the future? – Vinay Limbare Oct 18 '17 at 18:19

I finally found the solution:

SparkSession spark = SparkSession
                    .builder()
                    .appName("SampleApp")
                    .master("local")
                    .enableHiveSupport()
                    .getOrCreate();

JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
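
With this setup you can use both APIs side by side; here is a small sketch assuming the names above (the list values and the table name my_table are placeholders for illustration):

import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;

// RDD operations such as parallelize() go through the JavaSparkContext...
JavaRDD<Integer> numbers = jsc.parallelize(Arrays.asList(1, 2, 3, 4));
System.out.println("sum = " + numbers.reduce((a, b) -> a + b));

// ...while Hive tables are queried through the SparkSession itself.
spark.sql("SELECT * FROM my_table").show();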