
I have updated an Azure Databricks cluster from runtime 5.5 LTS to 7.3 LTS. Now I'm getting an error when I debug in VSCode. I have updated my Anaconda environment like this:

> conda create --name dbconnect python=3.7
> conda activate dbconnect
> pip uninstall pyspark
> pip install -U databricks-connect==7.3.*
> databricks-connect configure
> databricks-connect test
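As an extra sanity check beyond databricks-connect test, a trivial job like this (nothing Databricks-specific, just a sketch) should run against the remote cluster:

from pyspark.sql import SparkSession

# build/reuse the session that databricks-connect points at the cluster
spark = SparkSession.builder.getOrCreate()

# trivial remote job; should print 10 if the connection is working
print(spark.range(10).count())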

So far so good, but now I'm trying to debug the following:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
setting = spark.conf.get("spark.master")

if "local" in setting:
    from pyspark.dbutils import DBUtils
    dbutils = DBUtils(spark.sparkContext)

The line dbutils = DBUtils(spark.sparkContext) throws an exception:

Exception has occurred: AttributeError 'SparkContext' object has no attribute 'conf'

I have tried creating the conf myself:

from pyspark.dbutils import DBUtils
import pyspark
conf = pyspark.SparkConf()
pyspark.SparkContext.getOrCreate(conf=conf)
dbutils = DBUtils(spark.sparkContext)

but I still get the same error. Can someone tell me what I'm doing wrong please?


1 Answer


From the docs on Access DBUtils, you need to pass the SparkSession spark, not the SparkContext:

from pyspark.sql import SparkSession
from pyspark.dbutils import DBUtils

spark = SparkSession.builder.getOrCreate()

dbutils = DBUtils(spark)
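
For example, applying this fix to the spark.master check from the question (the DBFS root path at the end is only an illustration):

from pyspark.sql import SparkSession
from pyspark.dbutils import DBUtils

spark = SparkSession.builder.getOrCreate()

# "local" in spark.master means we're running through databricks-connect,
# so dbutils isn't predefined and must be built from the SparkSession
if "local" in spark.conf.get("spark.master"):
    dbutils = DBUtils(spark)

print(dbutils.fs.ls("/"))  # e.g. list the DBFS root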