
I wrote the following Scala code in a Databricks notebook:

val input = spark.read.format("csv").option("header",true).load("input_path")


input.write.format("delta")
       .partitionBy("col1Name","col2Name")
       .mode("overwrite")
       .save("output_path")

The input is read properly, and the resulting DataFrame contains the col1Name and col2Name columns.
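For what it's worth, this is roughly how I check that (col1Name and col2Name are placeholders, the same names used in partitionBy):

// sanity check: the header row was picked up and the partition columns exist
input.printSchema()
input.select("col1Name", "col2Name").show(5)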

The problem is that the data is written as plain Parquet and the _delta_log folder stays empty (no items), so when I try to read the data back as Delta I get the error: `output_path` is not a Delta table. How can I change the code so that the data is properly written as Delta, with the _delta_log folder filled in?
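For reference, this is roughly how I try to read the data back (output_path is the same placeholder path used in the write above):

// this is the read that fails with "`output_path` is not a Delta table"
val output = spark.read.format("delta").load("output_path")
output.show(5)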

I have tried different cluster configurations in Databricks (Apache Spark 2.4.5 with Scala 2.11, and Apache Spark 3.1.2 with Scala 2.12), but all of them give the same result.

Any idea how to fix this please?

Thanks a lot

