Questions tagged [delta-live-tables]

Databricks Delta Live Tables (DLT) is an ETL framework that uses a declarative approach to build reliable data pipelines and automatically manage infrastructure at scale.

Delta Live Tables simplifies development of reliable data pipelines in Python and SQL by providing a framework that automatically handles dependencies between components, enforces data quality, and removes administrative overhead through automatic cluster and data maintenance, ...

149 questions
1 vote • 1 answer

Databricks - Delta Live Table Pipeline - Ingest Kafka Avro using Schema Registry

I'm new to Azure Databricks and I'm trying to implement an Azure Databricks Delta Live Tables pipeline that ingests from a Kafka topic containing messages where the values are Schema Registry-encoded Avro. Work done so far... Exercise to Consume and…
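A common pattern for this is to fetch the value schema from the registry once and strip the Confluent wire-format header before decoding. A minimal sketch, assuming a Confluent-style registry; the registry URL, broker, and topic names are placeholders:

```python
import dlt
from pyspark.sql import functions as F
from pyspark.sql.avro.functions import from_avro
from confluent_kafka.schema_registry import SchemaRegistryClient

SR_URL = "https://my-schema-registry:8081"   # placeholder
TOPIC = "my_topic"                           # placeholder

# Fetch the latest value schema once, at pipeline start.
registry = SchemaRegistryClient({"url": SR_URL})
value_schema = registry.get_latest_version(f"{TOPIC}-value").schema.schema_str

@dlt.table(comment="Kafka Avro payloads decoded against the registry schema")
def kafka_avro_bronze():
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
        .option("subscribe", TOPIC)
        .load()
    )
    # Confluent wire format = 1 magic byte + 4-byte schema id + Avro body,
    # so skip the first 5 bytes before decoding.
    body = F.expr("substring(value, 6, length(value) - 5)")
    return raw.select(from_avro(body, value_schema).alias("v")).select("v.*")
```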
1 vote • 1 answer

How to prevent adding backslashes to a JSON string

I would like to read events from Event Hubs using Databricks. The events are in JSON format, but they can have different schemas (this matters because the solutions I find pass a schema to the from_json(jsonStr, schema) function, which I cannot use…
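The escaped backslashes usually appear when a column that already holds a JSON string is serialized a second time. A sketch of one way around it, assuming the Azure Event Hubs Spark connector with a placeholder connection string, extracting fields lazily instead of committing to one schema:

```python
from pyspark.sql import functions as F

# Placeholder connection settings for the Event Hubs Spark connector.
ehConf = {"eventhubs.connectionString": "<connection-string>"}
raw_df = spark.readStream.format("eventhubs").options(**ehConf).load()

# Decode the binary body once. Re-serializing a column that already holds
# a JSON string (e.g. with to_json) is what adds the escaped backslashes.
events = raw_df.select(F.col("body").cast("string").alias("json_str"))

# With varying schemas, individual fields can still be pulled out lazily
# without a full from_json schema:
typed = events.withColumn("event_type",
                          F.get_json_object("json_str", "$.type"))
```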
1 vote • 2 answers

Getting data quality in Delta Live Table (bronze, gold, silver..)

How can I check whether a Delta Live Table is in the bronze, silver, or gold layer (zone) with Python? I have a notebook that creates a Delta Live Tables pipeline, and I need to know the data quality level (bronze, silver, gold). How can I get that information with…
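One answer pattern: the layer is whatever the pipeline author recorded in the "quality" table property, so it can be read back from table metadata. A minimal sketch; the table name is hypothetical:

```python
# Read the table properties and look up the "quality" key that DLT
# pipelines conventionally set via TBLPROPERTIES ("quality" = "bronze").
props = spark.sql("SHOW TBLPROPERTIES live_db.raw_data")
quality = {row.key: row.value for row in props.collect()}.get("quality")
print(quality)  # e.g. "bronze", or None if the property was never set
```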
1 vote • 0 answers

Log metrics from Databricks to Datadog

What is the best way to log metrics from Databricks Delta Live Tables in Datadog? I have created a connection between Datadog and Databricks, and I can send logs from Databricks to Datadog, but I have a problem sending logs for Delta Live Tables. For…
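DLT metrics generally come from the pipeline event log rather than cluster logs. A sketch under assumptions: the classic storage layout where the event log is a Delta table under the pipeline storage location, and Datadog's public v1 series endpoint; the path, API key, and metric name are placeholders:

```python
import requests
from pyspark.sql import functions as F

events = spark.read.format("delta").load(
    "dbfs:/pipelines/<pipeline-id>/system/events"  # placeholder path
)
progress = events.filter(F.col("event_type") == "flow_progress")

for row in progress.select("timestamp").limit(100).collect():
    requests.post(
        "https://api.datadoghq.com/api/v1/series",
        headers={"DD-API-KEY": "<api-key>"},       # placeholder
        json={"series": [{
            "metric": "dlt.flow_progress.events",  # hypothetical metric name
            "points": [[int(row.timestamp.timestamp()), 1]],
        }]},
    )
```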
1 vote • 1 answer

Specify column name AND inferschema on Delta Live Table on Databricks

I'm playing around with the Databricks Delta Live Tables feature using the SQL API. This is my statement so far: --Create Bronze Landing zone table CREATE STREAMING LIVE TABLE raw_data COMMENT "mycomment" TBLPROPERTIES ("quality" = "bronze") AS…
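Pinning some column names and types while still inferring the rest is usually done through Auto Loader's `schemaHints` option. A sketch of the idea in the Python API; the file format, hints, and path are assumptions:

```python
import dlt

@dlt.table(comment="mycomment", table_properties={"quality": "bronze"})
def raw_data():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        # Pin names/types for specific columns; the rest stay inferred.
        .option("cloudFiles.schemaHints", "id BIGINT, amount DOUBLE")
        .load("/mnt/landing/raw")  # hypothetical path
    )
```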
Jamalan (482)

1 vote • 0 answers

Databricks Delta Live Tables stuck when ingesting files from S3

I'm new to Databricks and just created a Delta Live Tables pipeline to ingest 60 million JSON files from S3. However, the input rate (the number of files it reads from S3) is stuck at around 8 records/s, which seems very low. I have increased the number…
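With tens of millions of small objects, directory listing is typically the bottleneck rather than cluster size. A sketch of the usual mitigations, Auto Loader's file-notification mode plus a larger per-batch file cap; the bucket path is hypothetical:

```python
import dlt

@dlt.table
def s3_json_bronze():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Use S3 event notifications instead of repeatedly listing the bucket.
        .option("cloudFiles.useNotifications", "true")
        # Allow larger micro-batches than the default of 1000 files.
        .option("cloudFiles.maxFilesPerTrigger", "10000")
        .load("s3://my-bucket/events/")  # hypothetical
    )
```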
0 votes • 0 answers

Watermarking syntax with databricks spark sql

Documentation seems to be very sparse on Spark SQL watermarking syntax. I am able to find plenty on PySpark watermarking, but I am trying to accomplish this in SQL for a Databricks Delta Live Table. Below is the syntax I am using for said table that…
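Where the SQL syntax is unclear for a given runtime, the same watermark can be declared on the Python side of a DLT pipeline with `withWatermark`. A minimal sketch with hypothetical source table and column names:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table
def windowed_counts():
    return (
        dlt.read_stream("events")                   # hypothetical source
        .withWatermark("event_time", "10 minutes")  # tolerate 10 min lateness
        .groupBy(F.window("event_time", "5 minutes"))
        .count()
    )
```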
0 votes • 0 answers

How to create a DLT streaming live table using Python in Databricks

I have created a streaming live table using SQL as below: CREATE STREAMING LIVE TABLE customers_count_streaming COMMENT "count of customers" TBLPROPERTIES ("myCompanyPipeline.quality" = "gold") AS SELECT count(*) as customers_count FROM…
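A sketch of the Python counterpart. Note that a global count is a full aggregate, so it usually fits a complete-recompute live table (`dlt.read`) better than an append-only streaming one; the "customers" source name mirrors the truncated SQL and is an assumption:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(
    comment="count of customers",
    table_properties={"myCompanyPipeline.quality": "gold"},
)
def customers_count():
    # dlt.read gives a batch view that is fully recomputed each update,
    # which is what a running total over the whole table needs.
    return dlt.read("customers").agg(F.count("*").alias("customers_count"))
```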
Aseem (5,848)

0 votes • 0 answers

Do Databricks Delta Live Tables just overwrite after CDC and SCD?

I am facing the following issue: I use DLT to develop a pipeline with a multi-hop architecture. For ingestion into the bronze tables I use the Auto Loader functionality (the source is S3). For the silver table with my customer data I use the…
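The overwrite-vs-history behaviour is controlled by `stored_as_scd_type` in DLT's CDC API. A minimal sketch with hypothetical table, key, and sequencing column names:

```python
import dlt
from pyspark.sql import functions as F

dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="customers_bronze",     # hypothetical bronze feed
    keys=["customer_id"],
    sequence_by=F.col("ingest_ts"),
    stored_as_scd_type=2,          # 2 keeps history; 1 upserts in place
)
```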
0 votes • 0 answers

Databricks / Spark - escape character \ in column value splitting the column

I'm having a very frustrating time trying to get Spark not to split my column when it has a \ in the column value. What do I need to adjust to do this? datapath = "abfss:/location" df = (spark .read …
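For a delimited read, this usually comes down to the standard `escape`/`quote` DataFrameReader options. A sketch under the assumption the file is CSV-style text; pointing the escape character at the quote character makes a backslash inside a value literal:

```python
datapath = "abfss:/location"

df = (
    spark.read
    .format("csv")
    .option("header", "true")
    .option("quote", '"')
    # Default escape is "\"; redirecting it to the quote char means a
    # backslash in a value is treated as a literal character.
    .option("escape", '"')
    .load(datapath)
)
```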
0 votes • 1 answer

Pass parameters from Azure Data Factory to Delta Live Table pipeline

I'm trying to pass some variables from Azure Data Factory to Delta Live Tables by including them in the body of the Web activity I'm using to call the DLT API: {"fullRefresh":true,"configuration":{"ingestionsourceName":"XXX"}} The same code provided…
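One thing worth checking: the REST field names are snake_case, so camelCase keys like "fullRefresh" are silently ignored. A sketch of the equivalent call with host, pipeline id, and token as placeholders; per-update bodies accept `full_refresh`, while pipeline `configuration` is normally edited on the pipeline spec itself rather than in the update call:

```python
import requests

resp = requests.post(
    "https://<workspace-host>/api/2.0/pipelines/<pipeline-id>/updates",
    headers={"Authorization": "Bearer <token>"},
    # The documented field name is "full_refresh", not "fullRefresh".
    json={"full_refresh": True},
)
print(resp.json())  # contains the update_id of the triggered run
```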
0 votes • 1 answer

DataBricks Delta Live Tables Expectations: How to dynamically execute @dlt.expect()

I tried the following code, but running the DLT pipeline results in an error: if kwargs.get("df_tableoperation", None) is not None : ^ SyntaxError: invalid syntax The idea is to dynamically generate a series of tables from configurable metadata…
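Generating expectations from metadata generally works with `@dlt.expect_all`, which takes a whole dict of rules; a factory function avoids the classic late-binding loop-variable bug. A minimal sketch with hypothetical metadata:

```python
import dlt

TABLES = {
    "orders_clean": {
        "source": "orders_raw",
        "rules": {"valid_id": "order_id IS NOT NULL"},
    },
}

def make_table(name, spec):
    # Defining the table inside a function binds name/spec per iteration.
    @dlt.table(name=name)
    @dlt.expect_all(spec["rules"])
    def _t():
        return dlt.read(spec["source"])

for name, spec in TABLES.items():
    make_table(name, spec)
```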
0 votes • 1 answer

Databricks Delta Live Tables from Azure data stream error - no tables were discovered

Attempted to update an empty pipeline. This error usually means that no tables were discovered in your specified Notebook libraries. Please verify that your Notebook libraries include table definitions. DLT notebook code %python import…
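This error fires when the attached notebook defines no DLT datasets at all, e.g. because `import dlt` failed or no function carries a decorator. As a sanity check, the smallest definition that makes a pipeline discoverable:

```python
import dlt

@dlt.table  # any one @dlt.table or @dlt.view makes the pipeline non-empty
def heartbeat():
    return spark.range(1)
```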
Aseem (5,848)

0 votes • 1 answer

Is Delta Live Table available with Scala?

The documentation about Delta Live Tables says nothing about Scala support; only Python and SQL are mentioned: https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/. The Databricks guide mentions no Scala either…
alsetr (13)

0 votes • 1 answer

Trigger Databricks Delta Live Tables Pipeline from Synapse

We would like to trigger and run a Databricks Delta Live Tables pipeline, which creates a couple of bronze and silver tables, from an Azure Synapse pipeline. I can't find any info on that. However, it is possible to run Databricks notebooks in Synapse…
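There is no native Synapse activity for DLT, but a Synapse Web activity can call the Databricks pipelines REST API directly. A sketch of the same call as a script, plus polling until the update finishes; host, pipeline id, and token are placeholders:

```python
import time
import requests

HOST = "https://<workspace-host>"
HEADERS = {"Authorization": "Bearer <token>"}
PIPELINE = "<pipeline-id>"

# Start an update (this is what the Synapse Web activity would POST).
update_id = requests.post(
    f"{HOST}/api/2.0/pipelines/{PIPELINE}/updates",
    headers=HEADERS, json={},
).json()["update_id"]

# Poll until the update reaches a terminal state.
while True:
    state = requests.get(
        f"{HOST}/api/2.0/pipelines/{PIPELINE}/updates/{update_id}",
        headers=HEADERS,
    ).json()["update"]["state"]
    if state in ("COMPLETED", "FAILED", "CANCELED"):
        break
    time.sleep(30)
print(state)
```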
marritza (22)