I am trying to integrate PyCharm with my team's Databricks cluster using databricks-connect. I created a venv, ran pip install databricks-connect, and configured it with the necessary details of the Databricks cluster, following this Databricks documentation. I am using Python 3.8 and databricks-connect 9.1, since the cluster runs that runtime version.

When I run databricks-connect test, I get the error below. I also tried databricks-connect 10.4, but the same issue showed up, and I have experimented with other setups as well (including creating the virtual environment through the Anaconda prompt), all of which ran into the same error. Any help is appreciated.
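For reference, the setup inside the venv was roughly this (following the install flow from the docs; databricks-connect configure then prompts for the workspace host, token, cluster ID, org ID, and port, which I filled in with our cluster's values):

(venv) C:\Users\xxxxxxxx>pip uninstall pyspark
(venv) C:\Users\xxxxxxxx>pip install -U "databricks-connect==9.1.*"
(venv) C:\Users\xxxxxxxx>databricks-connect configure

The full output of databricks-connect test: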
(venv) C:\Users\xxxxxxxx>databricks-connect test
* PySpark is installed at C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark
* Checking SPARK_HOME
* Checking java version
java version "1.8.0_341"
Java(TM) SE Runtime Environment (build 1.8.0_341-b10)
Java HotSpot(TM) 64-Bit Server VM (build 25.341-b10, mixed mode)
* Skipping scala command test on Windows
* Testing python command
Traceback (most recent call last):
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\Scripts\databricks-connect.exe\__main__.py", line 7, in <module>
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\databricks_connect.py", line 283, in main
    test()
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\databricks_connect.py", line 248, in test
    spark = SparkSession.builder.getOrCreate()
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\sql\session.py", line 229, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\context.py", line 392, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\context.py", line 145, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\context.py", line 339, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\xxxxxxxx\Documents\pythonProject\venv\lib\site-packages\pyspark\java_gateway.py", line 101, in launch_gateway
    proc = Popen(command, **popen_kwargs)
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 1334, in _execute_child
    _winapi.CloseHandle(ht)
OSError: [WinError 6] The handle is invalid
Exception ignored in: <function Popen.__del__ at 0x000002CA0B15C0D0>
Traceback (most recent call last):
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 949, in __del__
    self._internal_poll(_deadstate=_maxsize)
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 1348, in _internal_poll
    if _WaitForSingleObject(self._handle, 0) == _WAIT_OBJECT_0:
OSError: [WinError 6] The handle is invalid
Exception ignored in: <function Handle.Close at 0x000002CA0B15B3A0>
Traceback (most recent call last):
  File "C:\Users\xxxxxxxx\AppData\Local\Programs\Python\Python38\lib\subprocess.py", line 194, in Close
    CloseHandle(self)
OSError: [WinError 6] The handle is invalid
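Judging from the traceback, the failure happens inside spark = SparkSession.builder.getOrCreate(): pyspark's launch_gateway() tries to spawn the local JVM with subprocess.Popen, and that Popen call raises the WinError before anything reaches the cluster. The test is essentially doing the equivalent of this minimal script (a sketch; spark.range() is just a placeholder action):

from pyspark.sql import SparkSession

# With databricks-connect's pyspark on the path, getOrCreate() calls
# launch_gateway(), which spawns the local JVM via subprocess.Popen;
# that Popen call is where "OSError: [WinError 6]" is raised for me.
spark = SparkSession.builder.getOrCreate()
print(spark.range(10).count())  # never reached

So I suspect the problem is in how the JVM process gets spawned on Windows rather than in my cluster configuration.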