1

I am trying to convert CSV files to Delta format. The conversion succeeds, but the second JSON transaction file contains a remove action that references the Parquet file written for the first CSV file, as shown below. The first JSON transaction file has no remove action.

{"remove":{"path":"part-00000-8780-121c6b34a252-c000.snappy.parquet","deletionTimestamp":1597827161514,"dataChange":true}}

I didn't delete any file or delete anything from the Delta table. Why am I seeing this remove action when I convert a new CSV file to Delta? Any suggestions, please?

Vishnu.K
  • Can you take a look at the header in the second json file? It should record which operation generated this commit. – zsxwing Aug 21 '20 at 20:02
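One way to follow this suggestion is to read the commit file line by line, since each line of a Delta log file is a separate JSON action. The sketch below is only an illustration: `summarize_commit` is a hypothetical helper, and the `commitInfo` line in `sample` is an assumed example of what an overwrite commit might record, not the content of the actual file from the question.

```python
import json


def summarize_commit(log_text):
    """Return (operation, mode, removed_paths) for the text of one Delta commit file."""
    operation, mode, removed = None, None, []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)  # one JSON action per line
        if "commitInfo" in action:
            info = action["commitInfo"]
            operation = info.get("operation")
            mode = info.get("operationParameters", {}).get("mode")
        elif "remove" in action:
            removed.append(action["remove"]["path"])
    return operation, mode, removed


# Assumed sample: the commitInfo line is illustrative; the remove line is from the question.
sample = "\n".join([
    '{"commitInfo":{"operation":"WRITE","operationParameters":{"mode":"Overwrite"}}}',
    '{"remove":{"path":"part-00000-8780-121c6b34a252-c000.snappy.parquet",'
    '"deletionTimestamp":1597827161514,"dataChange":true}}',
])

print(summarize_commit(sample))
```

If the reported mode is an overwrite, the remove action is expected: the commit logically replaces the files written by the earlier commit.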

2 Answers

0

Try adding .config("spark.databricks.delta.retentionDurationCheck.enabled", "false") when you build the SparkSession.

-1

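As far as I understand it, the remove action appeared because I was saving with Spark's "overwrite" mode: the second write replaced the data from the first CSV file, so its Parquet file was recorded as removed.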
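A minimal sketch of the difference, assuming PySpark with the Delta Lake package already configured on the session. `write_csv_as_delta` is a hypothetical helper, the paths are placeholders, and the `header` option is an assumption about the CSV files.

```python
def write_csv_as_delta(spark, csv_path, delta_path, mode="append"):
    """Read one CSV file and write it to a Delta table using the given save mode."""
    # Assumption: the CSV files have a header row.
    df = spark.read.option("header", "true").csv(csv_path)
    # "append" adds new files; "overwrite" also marks the existing files as removed.
    df.write.format("delta").mode(mode).save(delta_path)
    return mode
```

Calling it with mode="append" for the second CSV file should produce a commit with only add actions. With mode="overwrite", the commit also lists remove actions for the earlier files; those files are only logically removed and stay on disk until the table is vacuumed.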

Vishnu.K