I am currently writing PySpark pipelines using the databricks-connect library. The steps I followed are given here. The library is installed in a virtual environment.
When I try to execute this code:
spark.read.load(path).first()
I get this error:
<class 'TypeError'>, 'JavaPackage' object is not callable, <traceback object at 0x0000017AB70ECF88>
Traceback (most recent call last):
File "D:/Friendsurance/Repository/data-ingestion/job/main.py", line 83, in <module>
run()
File "D:/Friendsurance/Repository/data-ingestion/job/main.py", line 79, in run
el_job.run()
File "D:\Friendsurance\Repository\data-ingestion\job\task\__init__.py", line 18, in run
data: DataFrame = self.extract()
File "D:\Friendsurance\Repository\data-ingestion\job\task\ELTask.py", line 14, in extract
return self.extractor.extract()
File "D:\Friendsurance\Repository\data-ingestion\job\task\extractor\BucketExtractor.py", line 26, in extract
self.spark, self.load_storage.get_path(), self.conf.partition_column
File "D:\Friendsurance\Repository\data-ingestion\job\task\extractor\__init__.py", line 14, in calculate_last_day_run
spark.read.load(path).first().show()
File "D:\Friendsurance\Repository\data-ingestion\venv\lib\site-packages\pyspark\sql\dataframe.py", line 1381, in first
return self.head()
File "D:\Friendsurance\Repository\data-ingestion\venv\lib\site-packages\pyspark\sql\dataframe.py", line 1369, in head
rs = self.head(1)
File "D:\Friendsurance\Repository\data-ingestion\venv\lib\site-packages\pyspark\sql\dataframe.py", line 1371, in head
return self.take(n)
File "D:\Friendsurance\Repository\data-ingestion\venv\lib\site-packages\pyspark\sql\dataframe.py", line 657, in take
return self.limit(num).collect()
File "D:\Friendsurance\Repository\data-ingestion\venv\lib\site-packages\pyspark\sql\dataframe.py", line 596, in collect
if self._sc._conf.get(self._sc._jvm.PythonSecurityUtils.USE_FILE_BASED_COLLECT()):
TypeError: 'JavaPackage' object is not callable
However, outside the virtual environment, where I use the pyspark library provided here, the same line runs and returns the expected output.
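In case it helps with diagnosis, here is a minimal sketch of how the two environments could be compared. It rests on an assumption the traceback does not confirm: that the difference comes from which pyspark installation, or which SPARK_HOME, each environment picks up. The helper names (installed_versions, module_location, environment_report) are hypothetical and only for illustration, and the fallback import assumes the importlib_metadata backport is available on interpreters older than Python 3.8.

```python
import importlib.util
import os
import sys

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older Pythons.
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def module_location(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


def environment_report():
    """Collect the details that may differ between the two environments."""
    return {
        "interpreter": sys.executable,
        "SPARK_HOME": os.environ.get("SPARK_HOME"),
        "pyspark_location": module_location("pyspark"),
        "versions": installed_versions(["pyspark", "databricks-connect"]),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, "=", value)
```

Running this once inside the virtual environment and once outside it would show whether both a standalone pyspark and databricks-connect are installed side by side, and whether SPARK_HOME points at a different Spark distribution than the one the Python package expects.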
Can anyone please tell me where I am going wrong with this?