Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is an ETL framework that uses a declarative approach to build reliable data pipelines and automatically manage infrastructure at scale.

Delta Live Tables simplifies the development of reliable data pipelines in Python and SQL by automatically handling dependencies between components, enforcing data quality, and reducing administrative overhead with automatic cluster and data maintenance, among other features.
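The declarative model described above can be illustrated with a minimal, self-contained sketch. This is a toy stand-in, not the real `dlt` API (which is only available inside a Databricks pipeline): table definitions register themselves via a decorator, `read()` expresses a dependency, and the framework materializes tables in dependency order rather than in the order they are written.

```python
# Toy sketch of DLT's declarative model (hypothetical stand-in for the
# real `dlt` module, which only exists inside a Databricks pipeline).
_registry = {}   # table name -> definition function
_results = {}    # table name -> materialized rows

def table(func):
    """Register a table definition, as @dlt.table does conceptually."""
    _registry[func.__name__] = func
    return func

def read(name):
    """Resolve a dependency, materializing it on first use."""
    if name not in _results:
        _results[name] = _registry[name]()
    return _results[name]

@table
def bronze_events():
    # pretend this is raw ingested data
    return [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}]

@table
def silver_events():
    # data-quality rule: drop rows with negative amounts
    return [r for r in read("bronze_events") if r["amount"] >= 0]

print(read("silver_events"))  # -> [{'id': 1, 'amount': 10}]
```

Because `silver_events` declares its input by calling `read("bronze_events")`, the "framework" knows bronze must run first — the same idea that lets real DLT infer a pipeline graph from `dlt.read`/`dlt.read_stream` calls.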

149 questions
2 votes · 0 answers

Overwrite Scheme on Delta Live Tables workflow

I am new to Delta Live Tables and have been working with a relatively simple pipeline. The table that I am having an issue with is as follows: @dlt.table( table_properties={ "quality" : "silver" } ) def silver_catalog_product(): …
Oliver
2 votes · 1 answer

Delta Live Table able to write to ADLS?

I have an architectural requirement to store the data in ADLS under a medallion model, and I am trying to write to ADLS using Delta Live Tables as a precursor to creating the Delta Table. I've had success using CREATE TABLE…
2 votes · 1 answer

Use DLT table from one pipeline in another pipeline

If I have a DLT pipeline that creates a streaming live table called customers, how can I use that table in another pipeline? So, Pipeline A: CREATE OR REFRESH STREAMING LIVE TABLE customers AS Pipeline B: CREATE OR REFRESH STREAMING LIVE TABLE…
AndyMN
2 votes · 0 answers

Incrementally reading and aggregating parquet files from S3 using Databricks DLT

I am trying to use DLT for incremental processing where the inputs are parquet files arriving daily on S3. I am told that dlt.read_stream can help. I was able to incrementally read the files, but when I perform aggregations, it is doing wide…
2 votes · 1 answer

How can I control the order of Databricks Delta Live Tables' (DLT) creation for pipeline development?

I am developing a Databricks Pipeline, writing my DLTs in Python. I want to understand how to control the Pipeline's order of creation of DLTs. Currently, the Pipeline attempts to create every single DLT in the order that they're written in,…
JJ Kam
2 votes · 3 answers

Delta live tables - Slowly changing dimensions

Is it possible to create a Slowly Changing Dimension mechanism using Delta Live Tables? I would like to implement something like this https://docs.databricks.com/_static/notebooks/merge-in-scd-type-2.html But in the DLT docs I found "Processing…
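For background on the SCD question above: a Type 2 merge closes the current version of a changed row and inserts a new open-ended version alongside it. A minimal pure-Python sketch of that merge logic, with hypothetical field names (`address`, `start_date`, `end_date`) and no Spark, DLT, or `MERGE` statement involved:

```python
from datetime import date

def scd2_merge(dim, updates, key, today):
    """Toy SCD Type 2 merge on lists of dicts: close the current row
    when a tracked attribute changes, insert the new open version."""
    out = [dict(r) for r in dim]          # don't mutate the input
    for upd in updates:
        for row in out:
            if row[key] == upd[key] and row["end_date"] is None:
                if row["address"] != upd["address"]:
                    row["end_date"] = today   # close the old version
                    out.append({**upd, "start_date": today, "end_date": None})
                break
        else:
            # key never seen before: insert as a brand-new open row
            out.append({**upd, "start_date": today, "end_date": None})
    return out

dim = [{"cust_id": 1, "address": "Old St",
        "start_date": date(2020, 1, 1), "end_date": None}]
updates = [{"cust_id": 1, "address": "New Ave"}]
result = scd2_merge(dim, updates, "cust_id", date(2023, 6, 1))
# result now holds two versions of cust_id 1:
# the closed old row and the open new row
```

The linked Databricks notebook does the same thing at scale with a Delta `MERGE`; inside DLT the equivalent is handled declaratively rather than with a hand-written merge.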
2 votes · 1 answer

Databricks - Read Streams - Delta Live Tables

I have a number of tables (with varying degrees of differences in schemas but with a common set of fields) that I would like to union and load from bronze to silver incrementally. So the goal is to go from multiple tables to a single…
Trista_456
1 vote · 1 answer

Move managed DLT table from one schema to another schema in Databricks

I have a DLT table in schema A which is being loaded by a DLT pipeline. I want to move the table from schema A to schema B and repoint my existing DLT pipeline to the table in schema B. Also, I need to avoid a full reload in the DLT pipeline on the table in Schema…
Athi
1 vote · 1 answer

Limited options for source code path for Delta Live Tables (DLT)

When I run jobs, I can point to a file on GitHub or Azure DevOps and specify the branch I want the job to read from. However, when I create a DLT pipeline, I can only point to files on Databricks, and I cannot specify a branch. Pointing to a shared…
Oliver Angelil
1 vote · 1 answer

How to directly differentiate between a full refresh and an incremental update for a Delta Live Table?

I have tables that travel from bronze to silver to gold. I want to implement a function like 'is_full_refresh()' so the pipeline filters the df depending on the output: if it's a full refresh, don't filter; if it's incremental, filter by a, b, c. Checking…
1 vote · 1 answer

How to find namespace of tables in Delta Live Tables to query?

I created a pipeline using Delta Live Tables. How do I know the namespace of the tables? The name of this DLT pipeline is "dlt_test", so I tried select * from dlt_test.live_gold and select * from dlt_test_dlt_db.live_gold However, both failed and…
user3692015
1 vote · 0 answers

Incremental ingestion of Snowflake data with Delta Live Table (CDC)

I have some data sitting in Snowflake, and I want to apply CDC to it using Delta Live Tables, but I am having some issues. Here is what I am trying to do: @dlt.view() def table1(): return…
1 vote · 0 answers

How to capture dropped events in a PySpark Structured Streaming job

I have a PySpark streaming job which drops duplicate events by session id. I have a 30-minute watermark window. Snippet: unique_df = df.withColumn("timestamp", current_timestamp()).dropDuplicates(session_id).withWatermarking("timestamp", 30) I…
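As background for the watermarked-deduplication question above: the idea is to keep per-key state only within a time window, dropping duplicates that arrive inside it — and capturing the dropped events is just a matter of collecting them instead of discarding them. A plain-Python toy with hypothetical names, standing in for Spark's `dropDuplicates` combined with `withWatermark` (not the actual Spark API):

```python
from datetime import datetime, timedelta

def dedup_with_watermark(events, window=timedelta(minutes=30)):
    """Toy dedup-within-a-window: events are (session_id, timestamp)
    pairs; a session_id seen again within `window` of its last kept
    occurrence is dropped, and dropped events are captured."""
    last_seen = {}               # session_id -> timestamp of last kept event
    kept, dropped = [], []
    for session_id, ts in events:
        prev = last_seen.get(session_id)
        if prev is not None and ts - prev <= window:
            dropped.append((session_id, ts))   # duplicate inside the window
        else:
            last_seen[session_id] = ts
            kept.append((session_id, ts))
    return kept, dropped

t0 = datetime(2023, 1, 1, 12, 0)
events = [("a", t0),
          ("a", t0 + timedelta(minutes=10)),   # within 30 min -> dropped
          ("a", t0 + timedelta(minutes=45))]   # outside window -> kept
kept, dropped = dedup_with_watermark(events)
```

In real Structured Streaming the `dropped` side is not exposed directly; capturing it typically means comparing the input stream against the deduplicated output, which is what makes this question non-trivial.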
1 vote · 1 answer

Effect of "table_properties" property for Delta Live Tables

I have the following code: @dlt.table( name="ingested_data", comment="Ingest the table", table_properties={ "quality": "raw", "name": "property_name" } ) I am confused about what the table_properties dictionary does in practice. I…
Oliver Angelil
1 vote · 1 answer

Databricks Delta Live Tables (DLT) file format (notebooks or .py files?)

I noticed that it is possible to write DLT pipelines in both Databricks notebooks and .py files. Is there a recommended approach?
Oliver Angelil