
I am trying to use Databricks Connect.

I have installed databricks-connect version 9.1.39 in a virtual environment within my Python project.

I have selected the python3.8 executable in the virtual environment as the interpreter for the VS Code project. However, when I try to run a file that starts with

from databricks.connect import DatabricksSession

I always get a

ModuleNotFoundError: No module named 'databricks'

To make sure this was not caused by databricks-connect being missing from the environment I run the file in, I opened a Python shell inside the venv and ran the same line (from databricks.connect ...), and got the same error.

Why is this happening? Is it because databricks.connect is unrelated to databricks-connect?

Thanks in advance.

Alex Ott
MrMuppet

1 Answer


DatabricksSession exists only in Databricks Connect V2, which is designed for Databricks Runtime 13 or higher. Version 9.1.39 is the legacy client, which does not ship a databricks.connect module, hence the ModuleNotFoundError. If you use DBR 9.1, you need to follow the instructions for DBR 11.3 and lower: configure the connection details with the databricks-connect configure command, then create a normal Spark session:

from pyspark.sql import SparkSession

# Uses the connection details saved by `databricks-connect configure`
spark = SparkSession.builder.getOrCreate()
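
If you are not sure which of the two APIs your environment supports, a quick check of the installed package version can help. The following is only a minimal sketch: the helper names installed_connect_version and supports_databricks_session are made up for illustration, and it assumes the major version of the databricks-connect package tracks the Databricks Runtime version, so 13 or higher is taken to mean Connect V2.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_connect_version(package="databricks-connect"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def supports_databricks_session(version):
    """Return True when the major version is 13 or higher.

    Assumption: a major version of 13+ means Databricks Connect V2,
    which is the only variant that provides DatabricksSession.
    """
    if version is None:
        return False
    major = int(version.split(".")[0])
    return major >= 13
```

Run inside the venv, supports_databricks_session(installed_connect_version()) should come back False for 9.1.39, which would mean sticking with SparkSession.builder.getOrCreate() as shown above.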
Alex Ott