Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is an ETL framework that uses a simple declarative approach to build reliable data pipelines while automatically managing infrastructure at scale.

Delta Live Tables simplifies the development of reliable data pipelines in Python and SQL by providing a framework that automatically handles dependencies between components, enforces data quality, and removes administrative overhead through automatic cluster and data maintenance, ...

149 questions
1
vote
0 answers

Dynamically get the database name from a Databricks DLT pipeline

How do I dynamically get the database name from a DLT pipeline?
jencake
  • 121
  • 5
1
vote
1 answer

How do we test notebooks that use delta live table

I cannot execute the Delta Live Tables code in the notebook. I always have to create a DLT pipeline by going into the Workflows tab. Is there an easy way to test the Delta Live Tables code in a notebook? Thanks
Rajib Deb
  • 1,496
  • 11
  • 30
1
vote
1 answer

The purpose of having TBLPROPERTIES in create table

What is the purpose of using TBLPROPERTIES("quality" = "silver") while creating a table with the CREATE STREAMING LIVE TABLE... syntax? Is it just to tag the table as a silver table, or does it drive anything else during data processing?
Rajib Deb
  • 1,496
  • 11
  • 30
1
vote
0 answers

Streaming not working in Delta Live table pipeline (Databricks)?

I am working on a pipeline in Databricks > Workflows > Delta Live Tables and having an issue with the streaming part. Expectations: one bronze table reads the JSON files with Auto Loader (cloudFiles) in streaming mode (spark.readStream); one…
1
vote
2 answers

ModuleNotFoundError: No module named 'dlt' error when running Delta Live Tables Python notebook

When attempting to create a Python notebook and follow the various examples for setting up Databricks Delta Live Tables, you will immediately be met with the following error if you attempt to run your notebook: ModuleNotFoundError: No module named…
Alain
  • 26,663
  • 20
  • 114
  • 184
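The error above arises because the `dlt` module is provided only by the Delta Live Tables pipeline runtime, not by an interactively attached cluster. A minimal sketch of one common workaround, stubbing the module so notebook logic can still be parsed and unit-tested locally (the `RUNNING_IN_PIPELINE` flag and `build_bronze` helper are illustrative assumptions, not DLT API):

```python
# `import dlt` succeeds only when the notebook executes inside a
# Delta Live Tables pipeline; interactively it raises ModuleNotFoundError.
try:
    import dlt
    RUNNING_IN_PIPELINE = True
except ImportError:
    from unittest import mock
    dlt = mock.MagicMock()  # inert stand-in so decorator lines don't crash locally
    RUNNING_IN_PIPELINE = False

def build_bronze(df):
    """Keep transformation logic in a plain function so it can be tested
    outside the pipeline; only the @dlt.table registration needs the runtime."""
    return df
```

Inside the pipeline the function would be registered with `@dlt.table`; locally, `build_bronze` can be exercised directly against a test DataFrame.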
1
vote
1 answer

Apply Changes from a delta live streaming table to another delta live streaming table

I get the below error when I try to do APPLY CHANGES from one Delta Live streaming table to another Delta Live streaming table. Is this scenario not supported? pyspark.sql.utils.AnalysisException: rajib_db.employee_address_stream is a permanent…
Rajib Deb
  • 1,496
  • 11
  • 30
1
vote
1 answer

Lambda to trigger a DLT pipeline

I have a source file that I load into S3 as a Delta file. I can attach a Lambda trigger to the file. Is there a way to trigger a DLT pipeline based on the Lambda trigger?
Rajib Deb
  • 1,496
  • 11
  • 30
1
vote
1 answer

What type of clusters does DLT pipeline use

While creating a DLT pipeline, we do not specify any cluster. Does the DLT pipeline automatically spin up clusters? If yes, what type of clusters does it spin up?
Rajib Deb
  • 1,496
  • 11
  • 30
1
vote
2 answers

How to use delta live table with google cloud storage

[Cross-posting from Databricks community: link] I have been working on a POC exploring Delta Live Tables with a GCS location. I have some doubts: how do we access the GCS bucket? We have to establish a connection using a Databricks service account. In a…
1
vote
1 answer

Delta live tables data quality checks -Retain failed records

There are 3 types of quality checks in Delta Live Tables: expect (retain invalid records), expect_or_drop (drop invalid records), and expect_or_fail (fail on invalid records). I want to retain invalid records, but I also want to keep track of them. So,…
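A common pattern for that requirement is to use plain `expect` (so all rows are retained) together with a quarantine flag computed from the same rule. A plain-Python sketch of the flagging logic, not actual DLT code (the `is_quarantined` column name and the price rule are illustrative assumptions):

```python
def tag_quarantine(records, rule):
    """Keep every record, but flag the ones that fail the quality rule."""
    return [{**r, "is_quarantined": not rule(r)} for r in records]

rows = [{"id": 1, "price": 10}, {"id": 2, "price": -5}]
tagged = tag_quarantine(rows, rule=lambda r: r["price"] >= 0)
# All rows are retained; invalid ones carry is_quarantined=True
```

Downstream views can then filter on the flag to report or reprocess the quarantined rows without losing them.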
1
vote
1 answer

Databricks Auto Loader with Merge Condition

We have the following merge-to-delta function. The merge function ensures we update the record appropriately based on certain conditions. So, in the function usage, you can see we define the merge condition and pass it into the function. This…
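MERGE semantics reduce to a keyed upsert: rows matching the merge condition are updated, and the rest are inserted. A minimal in-memory sketch of that behavior (function and field names are illustrative; a real Delta implementation would use DeltaTable.merge with the condition string):

```python
def merge_upsert(target, updates, key):
    """Upsert `updates` into `target`, matching rows on `key` (the merge condition)."""
    merged = {key(r): r for r in target}
    for r in updates:
        merged[key(r)] = r  # matched -> update, not matched -> insert
    return list(merged.values())

current = [{"id": 1, "name": "old"}, {"id": 2, "name": "keep"}]
incoming = [{"id": 1, "name": "new"}, {"id": 3, "name": "insert"}]
result = merge_upsert(current, incoming, key=lambda r: r["id"])
# -> id 1 updated, id 2 untouched, id 3 inserted
```

With Auto Loader, the same upsert would typically run per micro-batch via foreachBatch, passing the merge condition into the function as the question describes.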
1
vote
1 answer

Delta Live Table - DLT pipeline is getting stuck at initializing state

I have a DLT pipeline in which it creates a Delta table by reading from SQL Server, and then we call a few APIs to update metadata in our Cosmos DB. Whenever we start it, it gets stuck in the initializing state. But when we run the same code using interactive…
Ravindra
  • 31
  • 2
1
vote
1 answer

Azure Event Hub to ensure read data only once with failure handling

Hey folks, I am working on a use case where I am implementing updated/incremental updates to Delta tables through Event Hubs in Azure. I came across Event Hubs and Delta Live Tables, which would be necessary. I have an HVR agent at the start which…
1
vote
1 answer

DLT: commas treated as part of column name

I am trying to create a STREAMING LIVE TABLE object in my Databricks environment, using an S3 bucket with a bunch of CSV files as a source. The syntax I am using is: CREATE OR REFRESH STREAMING LIVE TABLE t1 COMMENT "test table" TBLPROPERTIES ( …
Piotr L
  • 1,065
  • 1
  • 12
  • 29
1
vote
1 answer

Azure Databricks Delta live table

The Azure Databricks Delta Live Tables tab is missing from my Databricks notebooks. Why?