Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is an ETL framework that uses a simple declarative approach to building reliable data pipelines and automatically managing your infrastructure at scale.

Delta Live Tables simplifies development of reliable data pipelines in Python & SQL by providing a framework that automatically handles dependencies between components, enforces data quality, removes administrative overhead with automatic cluster & data maintenance, ...

149 questions
3
votes
1 answer

Databricks Delta Live Table - How To Simply Append A Batch Source To a DLT Table?

Using Python and all the relevant DLT properties within Databricks, does anyone know how to simply append to a DLT table from a batch source? In PySpark you can just use df.write.format("delta").mode("append") but since dlt requires you to return a…
Luke88
  • 33
  • 4
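A minimal sketch of one way to get append-only behavior, assuming (hypothetically) that the batch source is itself stored as a Delta table named source_db.batch_events; reading it as a stream lets DLT append only new records on each pipeline update instead of recomputing the whole table:

```python
import dlt

@dlt.table(name="events_appended")
def events_appended():
    # A streaming read over a Delta source picks up only newly appended
    # files, so the resulting DLT table grows append-only across runs.
    return spark.readStream.table("source_db.batch_events")
```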
3
votes
1 answer

Creating a table in Pyspark within a Delta Live Table job in Databricks

I am running a DLT (Delta Live Table) job that creates a Bronze table > Silver table for two separate tables. So in the end, I have two separate gold tables that I want merged into one table. I know how to do it in SQL but every time I run…
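A minimal sketch of a union in Python DLT, assuming two upstream tables with compatible schemas (the names silver_a and silver_b are hypothetical); dlt.read() wires the dependency into the pipeline graph automatically:

```python
import dlt

@dlt.table(name="gold_combined")
def gold_combined():
    a = dlt.read("silver_a")
    b = dlt.read("silver_b")
    # unionByName tolerates differing column order between the inputs
    return a.unionByName(b)
```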
3
votes
1 answer

How to change partition columns in delta live tables?

I first set up a Delta Live Table using Python as follows: @dlt.table def transaction(): return ( spark .readStream .format("cloudFiles") .schema(transaction_schema) .option("cloudFiles.format", "parquet") .load(path) …
Tse Kit Yam
  • 173
  • 8
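Partitioning is declared on the table decorator itself; a minimal sketch below, with a hypothetical partition column and source path. Changing partition_cols on an existing table generally requires a full refresh of the pipeline so the table is rewritten with the new layout:

```python
import dlt

@dlt.table(
    name="transaction",
    partition_cols=["transaction_date"],  # hypothetical partition column
)
def transaction():
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load("/mnt/raw/transactions")  # hypothetical path
    )
```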
3
votes
1 answer

How to use Apache Sedona on Databricks Delta Live tables?

I am trying to run some geospatial transformations in Delta Live Tables, using Apache Sedona. I tried defining a minimal example pipeline demonstrating the problem I encounter. In the first cell of my notebook, I install the apache-sedona Python package: %pip…
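A minimal sketch of the usual registration step, assuming the apache-sedona package and its matching Spark JARs are available on the pipeline cluster (installing only the Python wheel via %pip is typically not enough); the source table name is hypothetical, and the result is converted back to WKT text so Delta can store it:

```python
import dlt
from sedona.register import SedonaRegistrator

# Registers Sedona's ST_* SQL functions on the session (older-style API)
SedonaRegistrator.registerAll(spark)

@dlt.table(name="geo_bronze")
def geo_bronze():
    # ST_AsText converts the geometry back to WKT, a type Delta can persist
    return spark.sql(
        "SELECT id, ST_AsText(ST_Buffer(ST_GeomFromWKT(wkt), 1.0)) AS buffered_wkt "
        "FROM my_raw_points"  # hypothetical source table
    )
```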
2
votes
1 answer

How to read DeltaLake table using Pyspark

I have a Delta Lake table (parquet format) in an AWS S3 bucket. I need to read it into a dataframe using Pyspark in notebook code. I tried searching online but no success yet. Can anyone share sample code of how to read a Delta Lake table in Pyspark (…
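A minimal sketch, assuming the cluster already has credentials for the (hypothetical) bucket path:

```python
# Read directly from the S3 path
df = spark.read.format("delta").load("s3://my-bucket/path/to/delta-table")
df.show(5)

# Or, if the table is registered in the metastore:
df = spark.read.table("my_db.my_delta_table")
```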
2
votes
2 answers

How to set up authorization of Delta Live Tables to access Azure Data Lake files?

I am writing Delta Live Tables notebooks in SQL to access files from the data lake, something like this: CREATE OR REFRESH STREAMING LIVE TABLE MyTable AS SELECT * FROM cloud_files("DataLakeSource/MyTableFiles", "parquet",…
FAA
  • 179
  • 11
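One common approach is to grant the pipeline a service principal via Spark configuration, set in the DLT pipeline settings or the cluster spark_conf. A minimal sketch of the usual ABFS OAuth keys; the storage account, tenant, and secret scope names are all hypothetical:

```python
storage = "mystorageaccount"  # hypothetical ADLS Gen2 account
conf = {
    f"fs.azure.account.auth.type.{storage}.dfs.core.windows.net": "OAuth",
    f"fs.azure.account.oauth.provider.type.{storage}.dfs.core.windows.net":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    f"fs.azure.account.oauth2.client.id.{storage}.dfs.core.windows.net":
        "<application-id>",
    # Reference a Databricks secret rather than hard-coding the value
    f"fs.azure.account.oauth2.client.secret.{storage}.dfs.core.windows.net":
        "{{secrets/my-scope/my-sp-secret}}",
    f"fs.azure.account.oauth2.client.endpoint.{storage}.dfs.core.windows.net":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
```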
2
votes
1 answer

How to WriteStream Delta live tables to a Kafka topic

In my DLT pipeline, I have three layers - bronze, silver, and gold. The bronze layer reads JSON files from an S3 bucket, while the silver layer performs data processing tasks such as adding new columns. The gold layer is responsible for performing…
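DLT pipelines have traditionally written only to Delta tables, so a common workaround is a separate structured-streaming job that tails the gold table and publishes to Kafka. A minimal sketch; the broker, topic, table, and checkpoint names are hypothetical:

```python
from pyspark.sql.functions import to_json, struct

(spark.readStream.table("my_schema.gold_table")    # hypothetical gold table
    .select(to_json(struct("*")).alias("value"))   # Kafka expects a value column
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("topic", "gold-events")
    .option("checkpointLocation", "/tmp/checkpoints/gold_to_kafka")
    .start())
```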
2
votes
2 answers

create_streaming_live_table in DLT creates a VIEW instead of a delta table

I have the following piece of code and am able to run it as a DLT pipeline successfully @dlt.table( name = source_table ) def source_ds(): return spark.table(f"{raw_db_name}.{source_table}") ### Create the target table…
Yuva
  • 2,831
  • 7
  • 36
  • 60
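With the current API names, declaring the target via dlt.create_streaming_table and populating it with dlt.apply_changes yields a backing Delta streaming table rather than a view. A minimal sketch; the key and sequence columns are hypothetical:

```python
import dlt
from pyspark.sql.functions import col

dlt.create_streaming_table(name="target_table")

dlt.apply_changes(
    target="target_table",
    source="source_ds",
    keys=["id"],                    # hypothetical primary key
    sequence_by=col("updated_at"),  # hypothetical ordering column
)
```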
2
votes
2 answers

How to make sure values are mapped to the right delta table column?

I'm writing a PySpark job to read the Values column from table1. Table1 has two columns -> ID, Values Sample data in the Values column: +----+-----------------------------------+ | ID | values …
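The sample data is truncated, so the exact layout is unclear; a minimal sketch assuming (hypothetically) that the values column holds JSON, where parsing with an explicit schema assigns each field to its target column by name rather than by position:

```python
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType

# Hypothetical target schema for the parsed fields
schema = StructType([
    StructField("col_a", StringType()),
    StructField("col_b", StringType()),
])

parsed = (spark.read.table("table1")
    .withColumn("parsed", from_json(col("values"), schema))
    .select("ID", "parsed.col_a", "parsed.col_b"))
```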
2
votes
1 answer

Truncate silver delta live table and reload

I have a parameter value that determines whether the table needs a full load or an incremental load. In delta live tables, incremental load is not an issue as we apply changes and specify whether the table needs to be SCD1 or SCD2. However,…
RLH
  • 35
  • 4
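For the full-load case, one option is to trigger a full refresh programmatically, which truncates and rebuilds the pipeline's tables. A minimal sketch against the Databricks pipelines REST API; the host, token, and pipeline id are hypothetical:

```python
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.0/pipelines/<pipeline-id>/updates",
    headers={"Authorization": "Bearer <token>"},
    json={"full_refresh": True},  # truncate and reload all tables
)
resp.raise_for_status()
```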
2
votes
1 answer

How to separate Delta Live Tables production and development targets and repo branches?

I am trying to replicate some common data & analytics workflows using Delta Live Tables. Currently I am struggling to wrap my head around how to achieve the requirements below: Have different targets (hive metastore) to write into based on dev or…
Michael Brenndoerfer
  • 3,483
  • 2
  • 39
  • 50
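A common pattern is one pipeline per environment, each pointed at its own repo branch and target schema, with a configuration key the notebook reads at graph-build time. A minimal sketch; the key name mypipeline.env is hypothetical:

```python
import dlt

# Each pipeline sets e.g. {"mypipeline.env": "dev"} in its configuration
env = spark.conf.get("mypipeline.env", "dev")

@dlt.table(name=f"sales_{env}")
def sales():
    return spark.read.table("raw_db.sales")  # hypothetical source
```

The target database itself is usually set per pipeline via the pipeline's target setting, so the same notebook can also write to different schemas without name suffixes.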
2
votes
1 answer

DLT notebook calls the same table definition multiple times

I have a dlt table defined in my DLT notebook that should run exactly once. However, it always runs multiple times. It is as simple as this. This gives me errors when defining other tables. Why? Is DLT parallelizing my function and that's…
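This is expected to some degree: DLT evaluates the decorated functions to resolve the pipeline's dependency graph and may invoke them more than once, so the function must be a pure, side-effect-free description of the table. A minimal sketch with a hypothetical source:

```python
import dlt

@dlt.table(name="my_table")
def my_table():
    # No writes, prints, or counters here -- just return the DataFrame.
    # DLT may call this function several times while planning the graph.
    return spark.read.table("raw_db.events")  # hypothetical source
```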
2
votes
2 answers

SCD-2 using delta live table

Delta live table now has the capability to do SCD Type 2 changes. But after going through this feature, I understood that this will work if I have only one new row with a new effective date. In the scenario where I have two new rows with two…
Rajib Deb
  • 1,496
  • 11
  • 30
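A minimal sketch of SCD Type 2 via apply_changes; when several rows arrive for the same key, sequence_by orders them and each version gets its own validity window. The table, key, and sequence names are hypothetical:

```python
import dlt
from pyspark.sql.functions import col

dlt.create_streaming_table(name="customers_scd2")

dlt.apply_changes(
    target="customers_scd2",
    source="customers_updates",
    keys=["customer_id"],
    sequence_by=col("effective_date"),
    stored_as_scd_type=2,  # keep full history with start/end columns
)
```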
2
votes
0 answers

Delta Live Tables using SCD type 1

I'm trying to load data using DLT and SCD 1 and am running into the error message "Detected a data update in the source table at version x. This is currently not supported. If you'd like to ignore updates, set the option 'ignoreChanges' to…
AndyMN
  • 41
  • 2
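A minimal sketch of the option the error message names, applied to a streaming read of a hypothetical source path; newer runtimes offer skipChangeCommits as the replacement:

```python
df = (spark.readStream
    .format("delta")
    .option("ignoreChanges", "true")    # newer runtimes: skipChangeCommits
    .load("/mnt/silver/source_table"))  # hypothetical source path
```

Note that ignoreChanges can re-deliver rewritten rows downstream, so deduplication may still be needed.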
2
votes
0 answers

Delta Live Tables and ingesting AVRO

So, I'm trying to load avro files into DLT and create pipelines and so forth. As a simple data frame in Databricks, I can read and unpack the avro files using json functions / rdd.map / lambda functions. From there I can create a temp view then do a sql…
jo80
  • 21
  • 2
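Auto Loader can ingest Avro directly, so the manual RDD unpacking isn't needed inside DLT. A minimal sketch with a hypothetical landing path:

```python
import dlt

@dlt.table(name="avro_bronze")
def avro_bronze():
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "avro")
        .load("/mnt/landing/avro/")  # hypothetical landing path
    )
```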