
I am getting the following error for the code below; please help:

   from delta.tables import *
   ModuleNotFoundError: No module named 'delta.tables'
   INFO SparkContext: Invoking stop() from shutdown hook

Here is the code:

   from pyspark.sql import *

   if __name__ == "__main__":
       spark = SparkSession \
           .builder \
           .appName("DeltaLake") \
           .config("spark.jars", "delta-core_2.12-0.7.0") \
           .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
           .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
           .getOrCreate()

       from delta.tables import *

       data = spark.range(0, 5)

       data.printSchema()


An online search suggested verifying that the Scala version matches the Delta core jar version. Here are the Scala and jar versions:

"delta-core_2.12-0.7.0"

"Using Scala version 2.12.10, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_221"

RLT
3 Answers


Or you can also install it from PyPI:

   pip install delta-spark

See the delta-spark pip page.
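Once delta-spark is installed, a minimal sketch of wiring it up (this uses the `configure_spark_with_delta_pip` helper shipped with the delta-spark package in 1.0+; for older Delta versions you would keep the manual `spark.jars.packages` config instead):

```python
from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip  # provided by the delta-spark package

# Build the session as usual; the helper injects the matching delta-core
# Maven coordinates so the jar version always matches the pip package.
builder = SparkSession.builder \
    .appName("DeltaLake") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")

spark = configure_spark_with_delta_pip(builder).getOrCreate()

from delta.tables import *  # now resolves, since delta-spark is on the Python path
```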

Stephen

According to the delta package source, there is a Python module named `tables`. You should clone the repository and copy the delta folder under python/delta into your site-packages path (e.g. ..\python37\Lib\site-packages), then restart Python, and your code runs without the error.

I am using Python 3.5.3 and pyspark==3.0.1.
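A sketch of those steps on a Unix-like shell (the destination path is looked up dynamically; on Windows you would copy into the site-packages folder mentioned above instead):

```shell
# Fetch the Delta Lake repository, which contains the Python package
git clone https://github.com/delta-io/delta.git

# Copy the delta Python package into this interpreter's site-packages
cp -r delta/python/delta "$(python -c 'import site; print(site.getsitepackages()[0])')"
```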

Or b
  • Your answer helped. https://docs.delta.io/latest/quick-start.html - here is where it is. – RLT Jan 03 '21 at 20:46
  • The provided link doesn't point to the correct repository. Take a look at https://github.com/delta-io/delta and you will see that `tables` actually exists within the delta Python package. – Bram Jan 07 '21 at 14:36
  • @Bram Thanks, answer modified with instructions on how to do it. – Or b Jan 08 '21 at 11:18

There is a difference between `spark.jars` and `spark.jars.packages`: `spark.jars` takes a comma-separated list of local jar files, while `spark.jars.packages` takes Maven coordinates and downloads the artifacts (and their transitive dependencies) automatically. Since you are following the Quick Start, try replacing

.config("spark.jars", "delta-core_2.12-0.7.0")

with

.config("spark.jars.packages", "io.delta:delta-core_2.12:0.7.0")
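Applied to the question's code, the session builder would then look like this (same Delta version as in the question; Spark resolves the jar from Maven at startup, so the first run needs network access):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("DeltaLake") \
    .config("spark.jars.packages", "io.delta:delta-core_2.12:0.7.0") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") \
    .getOrCreate()

# The import succeeds now that the delta-core jar is on the classpath
from delta.tables import *
```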
Bram