I am packaging the following code in a wheel (.whl) file:
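from pkg_resources import resource_filename
from pyspark.sql import DataFrame

def path_to_model(anomaly_dir_name: str, data_path: str) -> str:
    # Resolve data_path inside the installed package anomaly_dir_name
    filepath = resource_filename(anomaly_dir_name, data_path)
    return filepath

def read_data(spark) -> DataFrame:
    # Read the parquet files bundled in the wheel
    return spark.read.parquet(str(path_to_model("sampleFolder", "data")))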
I confirmed that the whl file contains the parquet files under the sampleFolder/data/ directory. It works when I run it locally, but when I upload the whl file to DBFS and run it there, I get this error:
AnalysisException: Path does not exist: dbfs:/databricks/python/lib/python3.7/site-packages/sampleFolder/data;
I confirmed that the directory dbfs:/databricks/python does not exist. Any idea what could be causing this error?
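One possible explanation, offered as an assumption rather than a confirmed diagnosis: resource_filename returns a path on the node's local filesystem (/databricks/python/...), and Spark on Databricks may resolve a path with no scheme against DBFS, which would produce the dbfs:/ prefix seen in the error. The sketch below shows one thing to try, which is giving Spark an explicit file: URI. The helper names to_local_file_uri and read_local_parquet are hypothetical, and the approach assumes the wheel is installed at the same path on every node that reads the data.

```python
from pathlib import PurePosixPath


def to_local_file_uri(local_path: str) -> str:
    # Hypothetical helper: turn an absolute local path into a file: URI
    # so that it is not resolved against the default (DBFS) filesystem.
    # as_uri() raises ValueError if local_path is not absolute.
    return PurePosixPath(local_path).as_uri()


def read_local_parquet(spark, local_path: str):
    # Assumption: the files exist at local_path on the driver and on
    # every worker, e.g. because the wheel is installed cluster-wide.
    return spark.read.parquet(to_local_file_uri(local_path))


# The path from the error message, without the dbfs: prefix
print(to_local_file_uri(
    "/databricks/python/lib/python3.7/site-packages/sampleFolder/data"
))
# prints file:///databricks/python/lib/python3.7/site-packages/sampleFolder/data
```

With the functions from the question, this would be called as read_local_parquet(spark, path_to_model("sampleFolder", "data")).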
Thanks.