I have a notebook that performs a history load, loading six months of data each time, starting with 2018-10-01.
My Delta table is partitioned by calendar_date. After the initial load I am able to read the Delta table and look at the data just fine. But after the second load, for dates 2019-01-01 to 2019-06-30, the previous partitions no longer load normally using the delta format.
Reading my source Delta table like this throws an error saying the file doesn't exist:
game_refined_start = (
spark.read.format("delta").load("s3://game_events/refined/game_session_start/calendar_date=2018-10-04/")
)
However, reading it like below works just fine. Any idea what could be wrong?
spark.conf.set("spark.databricks.delta.formatCheck.enabled", "false")
game_refined_start = (
spark.read.format("parquet").load("s3://game_events/refined/game_session_start/calendar_date=2018-10-04/")
)