
I am running PySpark on my PC (Windows 10), but I cannot import HiveContext:

from pyspark.sql import HiveContext
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-25-e3ae767de910> in <module>
----> 1 from pyspark.sql import HiveContext

ImportError: cannot import name 'HiveContext' from 'pyspark.sql' (C:\spark\spark-3.0.0-preview-bin-hadoop2.7\python\pyspark\sql\__init__.py)

How should I proceed to resolve this?

    Following the helpful remarks from Oliver, I have rolled this back to the original version. Questions must not metamorphose to different questions once the original problem has been solved, as this risks invalidating existing answers. – halfer Dec 04 '19 at 23:54

1 Answer


You're using the preview release of Spark 3.0. According to the release notes, you should use `SparkSession.builder.enableHiveSupport()`:

In Spark 3.0, the deprecated HiveContext class has been removed. Use SparkSession.builder.enableHiveSupport() instead.
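A minimal sketch of what that looks like in PySpark (the app name and the SQL statement below are placeholders for illustration, not part of the release notes):

from pyspark.sql import SparkSession

# Enable Hive support in the builder chain; this replaces the removed HiveContext.
spark = (SparkSession.builder
         .appName("example")       # placeholder app name
         .enableHiveSupport()
         .getOrCreate())

# Hive-backed SQL now runs through the session itself.
spark.sql("SHOW TABLES").show()    # placeholder statement

Note that `enableHiveSupport()` must be called before `getOrCreate()`, i.e. within the same builder chain, as the comment below also points out.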

Oliver W.
  • Please don't update questions after they have been answered. Instead, create new questions. That being said, you must call `enableHiveSupport()` in the same chain where you create the actual `SparkSession`, not afterwards. So, `spark = SparkSession.builder.appName("foo").enableHiveSupport().getOrCreate()`. Then run `spark.sql("some sql statement")`. – Oliver W. Dec 01 '19 at 14:12
  • I have rolled the question back to its original state. – halfer Dec 04 '19 at 23:54