
I installed databricks-connect in a conda environment, without having pyspark installed (I read that having pyspark installed would conflict with the databricks-connect installation). After finishing the configuration of databricks-connect with the cluster, port... info, I tried to run pyspark within the conda environment, but it does not work:

Traceback (most recent call last):
  File "C:\Users\Name\Anaconda3\envs\conda_env1\Scripts\find_spark_home.py", line 86, in <module>
    print(_find_spark_home())
  File "C:\Users\Name\Anaconda3\envs\conda_env1\Scripts\find_spark_home.py", line 52, in _find_spark_home
    module_home = os.path.dirname(find_spec("pyspark").origin)
AttributeError: 'NoneType' object has no attribute 'origin'
The system cannot find the path specified.
The system cannot find the file specified.
The system cannot find the file specified.
The system cannot find the path specified.

Additional info: I'm using Windows 10 and Windows PowerShell to run my commands, with Java 8, Hadoop 3.3.4, databricks-connect==9.1 LTS, and Python 3.8.

Any ideas what the problem could be?

– the phoenix

1 Answer


Please set SPARK_HOME. You can get the correct value by running databricks-connect get-spark-home.
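For example, on Windows 10 with PowerShell (as in the question), the steps might look like the following sketch; the path shown is only a placeholder, use whatever databricks-connect get-spark-home actually prints for your environment:

# Print the Spark home that ships with databricks-connect
databricks-connect get-spark-home

# Set SPARK_HOME for the current PowerShell session
# (example path; replace it with the value printed above)
$env:SPARK_HOME = "C:\Users\Name\Anaconda3\envs\conda_env1\lib\site-packages\pyspark"

# Optionally persist it for future sessions
setx SPARK_HOME "C:\Users\Name\Anaconda3\envs\conda_env1\lib\site-packages\pyspark"

# Then retry
pyspark

After setting the variable, launching pyspark from the same shell should pick up the Spark distribution bundled with databricks-connect instead of failing in find_spark_home.py.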
