
I am using Auto Loader in Databricks. However, when I save the stream as a Delta table, the generated table is NOT Delta.

(df.writeStream  # df is the streaming DataFrame produced by Auto Loader
  .format("delta")  # <-----------
  .option("checkpointLocation", checkpoint_path)
  .option("path", output_path)
  .trigger(availableNow=True)
  .toTable(table_name))
delta.DeltaTable.isDeltaTable(spark, table_name)
> false

Why is the generated table not in Delta format? If I read the table with spark.read.table(table_name) it works, but if I try to use Redash or Databricks' built-in Data tab, it produces an error and the schema is not parsed correctly:

  An error occurred while fetching table: table_name
  com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException:
  org.apache.spark.sql.AnalysisException: Incompatible format detected.
  A transaction log for Databricks Delta was found at s3://delta/_delta_log,
  but you are trying to read from s3://delta using format("parquet").
  You must use 'format("delta")' when reading and writing to a delta table.
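One detail worth noting about the check above (an assumption about the API's behavior, but consistent with how Delta Lake identifies its tables): DeltaTable.isDeltaTable expects a filesystem path and decides by looking for a _delta_log directory at that path, so passing a table name can return false even when the underlying storage really is Delta. A pure-Python sketch of that path-based check, with a temporary directory standing in for the table location:

```python
import os
import tempfile

def looks_like_delta(path: str) -> bool:
    # Delta tables keep their transaction log in a _delta_log subdirectory;
    # this roughly mirrors what DeltaTable.isDeltaTable checks for a path.
    return os.path.isdir(os.path.join(path, "_delta_log"))

# Demo: a temporary directory stands in for the table's storage location.
with tempfile.TemporaryDirectory() as d:
    print(looks_like_delta(d))                 # False: no transaction log yet
    os.makedirs(os.path.join(d, "_delta_log"))
    print(looks_like_delta(d))                 # True once _delta_log exists
```

So a false result here may say more about what was passed to isDeltaTable than about the table's actual format.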

Hanan Shteingart

1 Answer


Could you try this:

(
    df.writeStream  # df: the streaming DataFrame from Auto Loader
    .option("checkpointLocation", <checkpointLocation_path>)
    .trigger(availableNow=True)
    .table("<table_name>")
)

Instead of toTable, can you try table?

Sharma