
I have the following set up in my company, and I would like to use Databricks from my Windows machine with PyCharm:

  • A Databricks cluster, runtime version 7.3 (Fig. 1)
  • Java 8
  • Spark on the C drive at this path: C:\spark-3.2.0-bin-hadoop3.2\spark-3.2.0-bin-hadoop3.2 (but I have removed this path from the SPARK_HOME environment variable)
  • PyCharm installed
  • Conda installed
  • pip install -U databricks-connect==7.* (based on the runtime version I see in Fig. 1); see the environment sketch after this list
  • The following configuration information: cluster URL, cluster ID, organization ID, Databricks token
  • I don't know the port number, so I am keeping the default 15001 (Fig. 1)
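
For reference, a minimal sketch of how this environment could be set up from scratch (the environment name dbconnect comes from the comments below; Databricks Runtime 7.3 runs Python 3.7 and the client's Python minor version should match; pyspark must not be installed separately because databricks-connect bundles its own copy):

conda create --name dbconnect python=3.7
conda activate dbconnect
pip uninstall -y pyspark
pip install -U databricks-connect==7.*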

Then I run the following command to set up the configuration:

databricks-connect configure
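
As far as I understand, this prompts for the host (cluster URL), token, cluster ID, org ID, and port, and writes them to a .databricks-connect file in the user's home directory, roughly of this shape (values elided):

{
  "host": "...",
  "token": "...",
  "cluster_id": "...",
  "org_id": "...",
  "port": "15001"
}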

and then run the test:

databricks-connect test


It is not working, and it pauses here with:

The system cannot find the path specified.

I don't know what path I am missing.
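
To check which Spark installation the client actually resolves, the databricks-connect CLI ships helper subcommands that print the paths it uses (as far as I know, these exist in this version):

databricks-connect get-spark-home
databricks-connect get-jar-dir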

I also played with the SPARK_HOME variable; with it set, databricks-connect threw the following errors, which asked me to unset it.

[screenshots: SPARK_HOME errors from databricks-connect test]
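
For reference, clearing the variable for the current shell session on Windows can be done like this (cmd and PowerShell respectively; neither changes the system-wide setting):

set SPARK_HOME=
Remove-Item Env:SPARK_HOME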

I followed the instructions in [1], [2], and [3].

Ali Saberi
  • Did you run `databricks-connect test` from your dbconnect environment? (It seems to me it fails at the python command.) As for `SPARK_HOME`, do not install `pyspark` separately, as this will indeed conflict. You should only install `databricks-connect` (which contains `pyspark`). – Jyr Jan 22 '22 at 19:18
  • @Jyr I ran `databricks-connect test` in the terminal of the project. Is that wrong? If so, how do I get into the `dbconnect` environment and run `databricks-connect test`? – Ali Saberi Jan 23 '22 at 02:02
  • `conda activate dbconnect`, but PyCharm should do this automatically. – Jyr Jan 23 '22 at 10:04
  • @Jyr I am already in the PyCharm terminal :| All of the above was run in the PyCharm terminal. – Ali Saberi Jan 23 '22 at 13:51
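
For completeness, the sequence Jyr is suggesting (a sketch, assuming the Conda environment is indeed named dbconnect) would be:

conda activate dbconnect
databricks-connect test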

0 Answers