Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is a declarative ETL framework for building reliable data pipelines and automatically managing infrastructure at scale.

Delta Live Tables simplifies development of reliable data pipelines in Python and SQL by providing a framework that automatically handles dependencies between components, enforces data quality, and removes administrative overhead with automatic cluster and data maintenance, among other features.

149 questions
1
vote
1 answer

Sink from Delta Live Table to Kafka, initial sink works, but any subsequent updates fail

I have a DLT pipeline that ingests a topic from my Kafka stream and transforms it into a DLT table, and then I wish to write that table back into Kafka under a new topic. So far, I have this working; however, it only works on the first load of the table, then after…
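A "works once, then fails on updates" sink often comes down to the writer's checkpoint. A minimal sketch of streaming a Delta table back out to Kafka, with placeholder table, broker, and topic names (none of these come from the question):

```python
# Sketch: stream a Delta table back out to Kafka. Kafka requires a `value`
# column of string or binary type, and the writer needs its own checkpoint
# location so subsequent updates resume incrementally instead of failing.
from pyspark.sql.functions import to_json, struct

(spark.readStream
      .table("my_schema.gold_table")                     # hypothetical table name
      .select(to_json(struct("*")).alias("value"))       # serialize each row as JSON
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
      .option("topic", "gold_topic")                     # placeholder topic
      .option("checkpointLocation", "/tmp/checkpoints/gold_to_kafka")
      .start())
```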
1
vote
0 answers

DLT table created in target is not queryable

I've created a DLT table by specifying a database in the target, but when querying it using a SQL endpoint, it throws an exception: 'Failure to initialize configuration'. Please note that the source is defined as ADLS. The DLT table should be…
1
vote
2 answers

Streaming from a Delta Live Table in Databricks to a Kafka instance

I have the following live table, and I'm looking to write it into a stream to be written back into my Kafka source. I've seen in the Apache Spark docs that I can use writeStream (I've already used readStream to get it out of my Kafka stream).…
1
vote
1 answer

Costs of Databricks Delta Live Tables

Will a Databricks Delta Live Table generate costs even when it finds no data to load? And would a solution in that case be to disable the job if you know no new data will arrive at the source for a while?
1
vote
1 answer

Passing Parameters to DLT task in Databricks workflows

I am making use of Databricks Workflows. I have a job that consists of three tasks: Extract (references a normal Databricks notebook), DLT (references a DLT pipeline), and PostExec (references a normal Databricks notebook). I pass a parameter into the…
HGe
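Parameters typically reach a DLT task through the pipeline's configuration key/value settings, which the pipeline notebook reads with `spark.conf.get`. A sketch, where the key name and source table are placeholders:

```python
import dlt

# Sketch: reading a pipeline parameter inside a DLT notebook. The key
# "mypipeline.start_date" is a placeholder; it would be set under the DLT
# pipeline's "configuration" settings (or supplied by the workflow task).
start_date = spark.conf.get("mypipeline.start_date", "2023-01-01")

@dlt.table(comment="filtered by a parameter passed via pipeline configuration")
def filtered_events():
    return (spark.read.table("raw_events")   # hypothetical source table
                 .where(f"event_date >= '{start_date}'"))
```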
1
vote
1 answer

Delta live tables data quality checks

I'm using Delta Live Tables from Databricks, and I was trying to implement a complex data quality check (a so-called expectation) by following this guide. After testing my implementation, I realized that even though the expectation is failing, the…
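A common source of "the expectation fails but nothing happens" is the expectation mode: a plain expectation only records violations. A sketch of the three modes, with placeholder table and column names:

```python
import dlt

# Sketch: the three DLT expectation modes. A plain @dlt.expect only *records*
# violations in the event log -- failing rows still land in the table, which
# can look like the check isn't working. Use _or_drop / _or_fail to act.
@dlt.table
@dlt.expect("non_negative_amount", "amount >= 0")                   # warn only
@dlt.expect_or_drop("valid_id", "id IS NOT NULL")                   # drop bad rows
@dlt.expect_or_fail("known_currency", "currency IN ('USD','EUR')")  # abort the update
def clean_orders():
    return spark.read.table("raw_orders")   # hypothetical source table
```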
1
vote
1 answer

Delta Live Tables: dump final gold table to Cassandra

We have a Delta Live Tables pipeline that reads from a Kafka topic, cleans/filters/processes/aggregates the messages, and dumps them to bronze/silver/gold tables. In order to build a REST service to retrieve the aggregated result, we need to dump the data from the gold…
user468587
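DLT itself only materializes Delta tables, so pushing the gold table onward usually means a separate streaming job. A sketch using `foreachBatch` with the Spark Cassandra connector (which must be installed on the cluster); keyspace, table, and path names are placeholders:

```python
# Sketch: a separate (non-DLT) streaming job that copies the gold table into
# Cassandra via the Spark Cassandra connector. Names are placeholders.
def write_to_cassandra(batch_df, batch_id):
    (batch_df.write
             .format("org.apache.spark.sql.cassandra")
             .option("keyspace", "analytics")   # placeholder keyspace
             .option("table", "aggregates")     # placeholder table
             .mode("append")
             .save())

(spark.readStream
      .table("gold_aggregates")                 # hypothetical gold table
      .writeStream
      .foreachBatch(write_to_cassandra)
      .option("checkpointLocation", "/tmp/checkpoints/gold_to_cassandra")
      .start())
```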
1
vote
1 answer

Spark read stream from Kafka using Delta tables

I'm trying to read a stream from a Kafka topic using Spark Structured Streaming in Python. I can read the messages and dump them to a bronze table with the default Kafka message schema, but I cannot cast the key and value from binary to string. I've tried the following…
user468587
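Kafka delivers `key` and `value` as binary columns, so the usual pattern is a `.cast("string")` before any JSON parsing. A sketch with a placeholder broker, topic, and schema:

```python
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType

# Sketch: cast Kafka's binary key/value to strings, then parse the value.
# Broker, topic, and payload schema below are placeholders.
payload_schema = StructType([StructField("id", StringType()),
                             StructField("event", StringType())])

df = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
           .option("subscribe", "events")                     # placeholder topic
           .load()
           .select(col("key").cast("string").alias("key"),
                   from_json(col("value").cast("string"), payload_schema)
                       .alias("data")))
```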
1
vote
1 answer

How to create a materialized view for an existing table in Delta Live Tables in Databricks using PySpark?

Here I am trying to create a materialized view for an existing Delta Live Table using PySpark. I have tried many times, but it shows an error every time. Can we create a materialized view in DLT? If so, please share a resource. Thank you!
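In DLT's Python API, a non-streaming table definition effectively behaves as a materialized view: it is recomputed from its query on each pipeline update. A sketch over an existing (non-pipeline) table, with placeholder names:

```python
import dlt

# Sketch: a non-streaming @dlt.table acts like a materialized view -- its
# query is re-evaluated on each update. The source table name is a
# placeholder for an existing table outside the pipeline.
@dlt.table(comment="materialized aggregate over an existing table")
def daily_totals():
    return (spark.read.table("existing_db.sales")   # hypothetical existing table
                 .groupBy("sale_date")
                 .sum("amount"))
```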
1
vote
2 answers

Is there a way to join two Live Tables on Delta Live Tables using Python?

I want to join two silver LIVE tables that are being streamed to create a gold table; however, I have run across multiple errors, including RuntimeError("Query function must return either a Spark or Koalas DataFrame")…
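`dlt.read()` returns a plain Spark DataFrame, so the standard join API applies; the quoted RuntimeError appears when the decorated function returns anything else. A sketch with placeholder table and column names:

```python
import dlt

# Sketch: joining two pipeline tables in Python. The query function must
# return the joined Spark DataFrame itself -- returning anything else raises
# "Query function must return either a Spark or Koalas DataFrame".
@dlt.table(comment="gold table joined from two silver tables")
def gold_orders():
    orders = dlt.read("silver_orders")         # placeholder silver table
    customers = dlt.read("silver_customers")   # placeholder silver table
    return orders.join(customers, on="customer_id", how="inner")
```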
1
vote
0 answers

How to test a DLT pipeline in order to achieve a TDD approach

As per this document: https://learn.microsoft.com/en-us/azure/databricks/release-notes/product/2021/may#create-and-manage-etl-pipelines-using-delta-live-tables-public-preview DLT enables a test-driven development approach for creating and managing…
1
vote
1 answer

References to Streaming Delta Live Tables

It was my understanding that references to streaming Delta Live Tables require the use of the STREAM() function, supplying the table name as an argument. Below is a code snippet that I found in one of the demo notebooks that Databricks…
Minura Punchihewa
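The two SQL reference styles have direct Python counterparts, which may clarify the distinction. A sketch with a placeholder source table:

```python
import dlt

# Sketch: dlt.read() corresponds to `LIVE.source_table` (a complete/batch
# read), while dlt.read_stream() corresponds to `STREAM(LIVE.source_table)`
# and is needed to consume a streaming table incrementally.
@dlt.table
def batch_copy():
    return dlt.read("source_table")          # like: SELECT * FROM LIVE.source_table

@dlt.table
def incremental_copy():
    return dlt.read_stream("source_table")   # like: ... FROM STREAM(LIVE.source_table)
```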
1
vote
1 answer

Ambiguous reference to fields StructField in Databricks Delta Live Tables

I have set up Auto Loader to regularly read JSON files and store them in a "bronze" table called fixture_raw using Delta Live Tables in Databricks. This works fine, and the JSON data is stored in the specified table, but when I add a "silver" table…
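Ambiguous-reference errors on StructFields usually come from duplicate column names, often nested fields that differ only in case and collide when expanded with `*`. A sketch of the usual workaround, selecting fields explicitly with aliases; all field names are placeholders, not from the question:

```python
import dlt
from pyspark.sql.functions import col

# Sketch: avoid "Ambiguous reference to fields" by selecting nested fields
# explicitly with distinct aliases instead of expanding everything with `*`.
@dlt.table
def fixture_silver():
    return (dlt.read("fixture_raw")
               .select(col("match.id").alias("match_id"),
                       col("match.Id").alias("match_id_alt"),   # disambiguated
                       col("score.home").alias("home_score")))
```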
1
vote
0 answers

Delta Live Tables pipeline running time

New to Databricks Delta Live Tables. I set up my first pipeline to ingest a single 26 MB CSV file from an Azure blob using the following code: import dlt @dlt.table( comment="this is a test" ) def accounts(): return ( …
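A minimal pipeline of this shape might look like the sketch below, which assumes Auto Loader (cloudFiles) over a placeholder mount path; it does not reconstruct the question's truncated code. Note that a DLT update also spends time provisioning a cluster, so a small file can still take minutes end to end:

```python
import dlt

# Sketch: minimal DLT ingestion of a CSV folder with Auto Loader.
# Path and options are placeholders.
@dlt.table(comment="this is a test")
def accounts():
    return (spark.readStream
                 .format("cloudFiles")
                 .option("cloudFiles.format", "csv")
                 .option("header", "true")
                 .load("/mnt/raw/accounts/"))   # placeholder mount path
```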
1
vote
1 answer

Transform column names of Delta Live Tables in Databricks

I am ingesting a CSV file from mounted blob storage into a Delta Live Table, and here's my initial query: CREATE INCREMENTAL LIVE TABLE table_raw COMMENT "Ingesting data from /mnt/foo" TBLPROPERTIES ("quality" = "bronze") AS SELECT * FROM…
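Renaming on ingest amounts to selecting with aliases instead of `SELECT *`. The same idea in the Python API, with placeholder column names (the path comes from the question; everything else is assumed):

```python
import dlt
from pyspark.sql.functions import col

# Sketch: rename columns during ingestion by selecting with aliases,
# equivalent to listing `col AS new_name` in the SQL SELECT.
@dlt.table(
    comment="Ingesting data from /mnt/foo with renamed columns",
    table_properties={"quality": "bronze"},
)
def table_raw():
    return (spark.readStream
                 .format("cloudFiles")
                 .option("cloudFiles.format", "csv")
                 .load("/mnt/foo")
                 .select(col("First Name").alias("first_name"),    # placeholder
                         col("Last Name").alias("last_name")))     # placeholder
```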