
I created a Delta table in Databricks using SQL:

%sql
create table nx_bronze_raw
(
  `Device` string
)
USING DELTA LOCATION '/mnt/Databricks/bronze/devices/';

Then I ingest data (the Device column) into this table using:

bronze_path = '/mnt/Databricks/bronze/devices/'
df.select('Device').write.format("delta").mode("append").save(bronze_path)

The underlying storage is Azure Blob Storage, and the Databricks runtime is 12.1.

The problem is that when I query this table it returns 0 records:

df_read = spark.read.format("delta").load("/mnt/Databricks/bronze/devices/")
display(df_read)

Query returned no results

However, when I look inside the storage account, the Delta files have been created with the expected size (screenshot omitted).

What went wrong in this scenario, especially since no error is returned? And why can't I retrieve the data?


1 Answer


The following are possible reasons for getting empty results.

  • The DataFrame you wrote was empty (a quick check is sketched below).

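A minimal sketch of that check, using the df and bronze_path variables from the question (count() is a standard PySpark action):

# Verify the source DataFrame actually contains rows before writing.
row_count = df.select("Device").count()
print(f"Rows about to be written: {row_count}")

if row_count > 0:
    df.select("Device").write.format("delta").mode("append").save(bronze_path)
else:
    print("Source DataFrame is empty - nothing will land in the Delta table")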

  • The table was truncated between writing and reading it.

(Screenshots of the table contents before and after truncation omitted.)

In this case, you can check the table's history with DESCRIBE HISTORY and read the data of a specific version, as shown in the sketch below.

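A minimal sketch of inspecting the history, assuming the same example path as the read below (DESCRIBE HISTORY is the standard Delta Lake command for listing table versions):

# List the table's versions and operations to find one that still has the data.
history = spark.sql("DESCRIBE HISTORY delta.`/mnt/Databricks/bronze/devices2/`")
display(history.select("version", "timestamp", "operation"))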

Here, we select the version before the table was truncated:

df_read = spark.read.format("delta").option("versionAsOf", 3).load("/mnt/Databricks/bronze/devices2/")
display(df_read)


  • There is a chance that the data has been written to the Delta files but hasn't been flushed to the table yet. To ensure that all changes are visible, you can try running OPTIMIZE (a Python alternative is sketched after the SQL below).

code:

%sql
optimize raw;

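If you prefer to run OPTIMIZE from a Python cell, a minimal sketch (assuming the table path from the question) is:

# Same OPTIMIZE, issued from Python against the path-based Delta table.
spark.sql("OPTIMIZE delta.`/mnt/Databricks/bronze/devices/`")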
