Questions tagged [azure-databricks]

For questions about using the Databricks Lakehouse Platform on Microsoft Azure

Overview

Azure Databricks is the Azure-based implementation of Databricks, which is a high-level platform for working with Apache Spark and includes Jupyter-style notebooks.

Azure Databricks is a first-class Azure service and natively integrates with other Azure services such as Active Directory, Blob Storage, Cosmos DB, Data Lake Store, Event Hubs, HDInsight, Key Vault, Synapse Analytics, etc.


4095 questions
1 vote, 1 answer

Overwrite/remove a Delta table in Azure Databricks after an error writing a null column without a type cast

I am using PySpark in Azure Databricks. I attempted to write a Delta table with a null column created as follows: df = df.withColumn('val2', funcs.lit(None)), using the following function: def write_to_delta_table(df, fnm, tnm, path): …
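A likely cause, sketched below rather than taken from the question's answers: funcs.lit(None) produces a NullType (void) column, which Delta cannot store. Casting the literal to a concrete type before overwriting the broken table avoids the error; the table path and data here are hypothetical.

    from pyspark.sql import SparkSession, functions as funcs
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1,), (2,)], ["val1"])

    # lit(None) alone yields a void column; cast it to a concrete type.
    df = df.withColumn("val2", funcs.lit(None).cast(StringType()))

    (df.write
       .format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")  # replace the schema left by the failed write
       .save("/mnt/delta/my_table"))       # hypothetical path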
1 vote, 4 answers

How to get all parameters related to a Databricks job run into Python?

I am trying to get all parameters related to a Databricks job and import them into Python. These parameters should include the date, start time, duration, status of the job (successful or failed), and all other parameters related to it. I want to use…
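A minimal sketch of pulling this run metadata with the Jobs REST API (2.1) via requests; the workspace URL, token, and job_id are hypothetical placeholders. Times come back as epoch milliseconds, and the success/failure outcome sits under state.result_state.

    import requests

    host = "https://adb-1234567890123456.7.azuredatabricks.net"  # hypothetical
    token = "dapiXXXXXXXX"                                        # hypothetical PAT
    job_id = 123                                                  # hypothetical

    resp = requests.get(
        f"{host}/api/2.1/jobs/runs/list",
        headers={"Authorization": f"Bearer {token}"},
        params={"job_id": job_id, "limit": 25},
    )
    resp.raise_for_status()

    for run in resp.json().get("runs", []):
        print(
            run["run_id"],
            run.get("start_time"),                     # epoch ms
            run.get("execution_duration"),             # ms
            run.get("state", {}).get("result_state"),  # e.g. SUCCESS, FAILED
        )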
1 vote, 1 answer

How do I add a NULL column to a new table based on an existing Delta table using Databricks SQL?

I tried to make a new table from a Delta table, adding a new NULL column, using Databricks SQL. Databricks is not able to make a NULL column; if I fill the newly made column it works fine. How do I add a NULL column to a new table based…
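This usually fails because a bare NULL in a CTAS gets the void type, which the new table cannot hold; casting it to a concrete type works. A minimal sketch with hypothetical table names, expressed through spark.sql:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # CAST(NULL AS STRING) gives the new column a real type instead of void.
    spark.sql("""
        CREATE OR REPLACE TABLE new_table AS
        SELECT *, CAST(NULL AS STRING) AS new_col
        FROM existing_delta_table
    """)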
1 vote, 1 answer

How do I efficiently migrate MongoDB to Azure Cosmos DB with the help of Azure Databricks?

While searching for a service to migrate our on-premises MongoDB to Azure Cosmos DB with the Mongo API, we came across a service called Azure Databricks. We have a total of 186 GB of data, which we need to migrate to Cosmos DB with as little downtime as…
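One common pattern, sketched under assumptions: read with the MongoDB Spark connector (10.x option names shown; older versions use format "mongo" and different option keys) and write to Cosmos DB through its Mongo API endpoint with the same connector. Both connection strings are hypothetical, and the connector JAR must be installed on the cluster.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    source_uri = "mongodb://onprem-host:27017"  # hypothetical
    target_uri = ("mongodb://account:key@account.mongo.cosmos.azure.com:10255/"
                  "?ssl=true")                  # hypothetical Cosmos Mongo API endpoint

    df = (spark.read
          .format("mongodb")
          .option("connection.uri", source_uri)
          .option("database", "mydb")            # hypothetical
          .option("collection", "mycollection")  # hypothetical
          .load())

    (df.write
       .format("mongodb")
       .option("connection.uri", target_uri)
       .option("database", "mydb")
       .option("collection", "mycollection")
       .mode("append")
       .save())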
1 vote, 1 answer

How to ingest data from Event Hubs to ADLS using a Databricks cluster (Scala)

I want to ingest streaming data from Event Hubs to ADLS Gen2 in a specified format. I have done batch data ingestion, from a DB to ADLS and from container to container, but now I want to try streaming data ingestion. Can you please guide me on where to…
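A minimal PySpark sketch of the shape of such a job (the question asks for Scala, but the structure is the same), assuming the azure-eventhubs-spark connector is installed on the cluster; the connection string and ABFSS paths are hypothetical.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    conn = "Endpoint=sb://mynamespace.servicebus.windows.net/;...;EntityPath=myhub"  # hypothetical
    eh_conf = {
        # The connector expects the connection string encrypted via its helper.
        "eventhubs.connectionString":
            spark.sparkContext._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn),
    }

    stream = (spark.readStream
              .format("eventhubs")
              .options(**eh_conf)
              .load())

    # The payload arrives as binary; cast it before persisting.
    out = stream.selectExpr("CAST(body AS STRING) AS body", "enqueuedTime")

    (out.writeStream
        .format("parquet")  # or "delta"/"json", depending on the target format
        .option("checkpointLocation",
                "abfss://container@account.dfs.core.windows.net/checkpoints/eh")  # hypothetical
        .option("path", "abfss://container@account.dfs.core.windows.net/raw/eh")  # hypothetical
        .start())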
1 vote, 1 answer

Event Hubs stream not catching schema mismatch

We are trying to implement badRecordsPath when reading in events from an Event Hub. As an example, to try to get it working, I have put in a schema that should fail the event: eventStreamDF = (spark.readStream .format("eventhubs") …
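For context: badRecordsPath applies to file-based sources, and the eventhubs source always returns its own fixed schema (binary body plus metadata), so a payload schema mismatch never reaches it. A common alternative, sketched here with a hypothetical payload schema against the question's eventStreamDF, is parsing the body with from_json, which yields NULL for malformed rows that can then be routed separately:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, IntegerType, StringType

    payload_schema = StructType([          # hypothetical payload schema
        StructField("id", IntegerType()),
        StructField("name", StringType()),
    ])

    parsed = eventStreamDF.withColumn(
        "json", F.from_json(F.col("body").cast("string"), payload_schema))

    good = parsed.filter(F.col("json").isNotNull()).select("json.*")
    bad = parsed.filter(F.col("json").isNull())  # send these to your own bad-records sink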
1 vote, 2 answers

Call the Databricks API from a DevOps pipeline using a service principal

I want to be able to call the Databricks API from a DevOps pipeline. I can do this using a personal access token for my account; however, I want to make the API calls user-independent, so I wanted to use a service principal (app registration). I followed this…
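A minimal sketch of the client-credentials flow: request an AAD token for the AzureDatabricks first-party application (well-known resource ID 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d), then call the workspace API with it. Tenant, client, and workspace values are hypothetical, and the service principal must already have been added to the workspace.

    import requests

    tenant_id = "<tenant-id>"            # hypothetical
    client_id = "<sp-client-id>"         # hypothetical
    client_secret = "<sp-client-secret>" # hypothetical

    token_resp = requests.post(
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
        data={
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default",
        },
    )
    token_resp.raise_for_status()
    aad_token = token_resp.json()["access_token"]

    resp = requests.get(
        "https://adb-1234567890123456.7.azuredatabricks.net/api/2.0/clusters/list",  # hypothetical
        headers={"Authorization": f"Bearer {aad_token}"},
    )
    print(resp.status_code, resp.json())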
1 vote, 0 answers

Linked service from Azure Data Factory to Databricks: how to parametrize?

I am using the new job cluster option while creating a linked service from ADF (Data Factory) to Databricks with Spark configs. I want to parametrize the Spark config values as well as the keys. I know it's quite easy to parametrize values by referring to this…
1 vote, 1 answer

Ingest CSV data with Auto Loader with a specific delimiter/separator

I'm trying to load several CSV files with a complex separator ("~|~"). The current code loads the CSV files but does not identify the correct columns, because it is using the separator (","). I'm reading the documentation here…
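Auto Loader passes format-specific reader options straight through, so the CSV separator can be set with "sep"; multi-character separators like "~|~" need Spark 3.x. A minimal sketch with hypothetical paths:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    stream = (spark.readStream
              .format("cloudFiles")
              .option("cloudFiles.format", "csv")
              .option("cloudFiles.schemaLocation", "/mnt/schemas/mycsv")  # hypothetical
              .option("sep", "~|~")        # the custom separator
              .option("header", "true")
              .load("/mnt/landing/csv/"))  # hypothetical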
1 vote, 0 answers

Access a function from another script in the Shared folder in Azure Databricks

I am new to Azure Databricks and have run into a situation. I have a dev_tools Python script at the workspace/Shared/dev_tools location. The dev_tools script contains the following code (this is an example and not the actual code): def add (first_num,…
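On recent runtimes, workspace files are visible under /Workspace, so the Shared folder can be put on sys.path and imported directly; a minimal sketch, reusing the add function named in the question:

    import sys

    sys.path.append("/Workspace/Shared")  # make the Shared folder importable

    from dev_tools import add
    print(add(1, 2))

    # On older runtimes without workspace-file support, the usual fallback is
    # the notebook magic: %run /Shared/dev_tools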
1 vote, 1 answer

Snowflake table from Databricks using Python/Scala

I want to create a table in Snowflake and load data into it from Databricks using Python/Scala. Below is my code snippet, and I'm getting the below error. How can I first create the table, if it does not exist, from a Databricks notebook using Python or Scala, and…
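For reference, the Snowflake Spark connector creates the target table automatically when writing with mode("overwrite") or mode("append"), so no explicit CREATE TABLE is needed; a minimal sketch with hypothetical connection values:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])

    sf_options = {  # all values hypothetical
        "sfUrl": "myaccount.snowflakecomputing.com",
        "sfUser": "user",
        "sfPassword": "****",
        "sfDatabase": "MY_DB",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MY_WH",
    }

    (df.write
       .format("snowflake")   # "net.snowflake.spark.snowflake" also works
       .options(**sf_options)
       .option("dbtable", "MY_TABLE")
       .mode("overwrite")     # creates the table if it does not exist
       .save())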
1 vote, 1 answer

Using the Confluent kafka-schema-registry-client with basic auth against a managed Confluent Schema Registry in Databricks

In my Spark application I have the following Scala code: val restService = new RestService(schemaRegistryUrl) val props = Map( "basic.auth.credentials.source" -> "USER_INFO", "basic.auth.user.info" -> "%s:%s".format(key, secret) ).asJava …
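For comparison, the equivalent basic-auth setup in the Python confluent-kafka client (URL, key, and subject are hypothetical); unlike the Java/Scala client, the Python client takes only url and basic.auth.user.info, with no separate credentials-source setting:

    from confluent_kafka.schema_registry import SchemaRegistryClient

    sr = SchemaRegistryClient({
        "url": "https://psrc-xxxxx.westeurope.azure.confluent.cloud",  # hypothetical
        "basic.auth.user.info": "KEY:SECRET",                          # api-key:api-secret
    })

    registered = sr.get_latest_version("my-topic-value")  # hypothetical subject
    print(registered.schema.schema_str)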
1 vote, 0 answers

Databricks fails to create a StructType/schema from a case class installed as a JAR file

I am using ScalaReflection to create a schema from case classes. I have installed the JAR containing the case classes on the Databricks cluster, and when I invoke the following…
1 vote, 1 answer

Databricks: Azure Queue Storage structured streaming key not found error

I am trying to write an ETL pipeline for AQS (Azure Queue Storage) streaming data. Here is my code: CONN_STR = dbutils.secrets.get(scope="kvscope", key = "AZURE-STORAGE-CONN-STR") schema = StructType([ StructField("id", IntegerType()), StructField("parkingId",…
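For reference, a minimal well-formed ABS-AQS read in a Databricks notebook (where spark is predefined), with option names as given in the Databricks connector documentation (verify against your runtime's docs) and a hypothetical queue name; CONN_STR and schema are the values from the question:

    stream = (spark.readStream
              .format("abs-aqs")
              .option("fileFormat", "json")
              .option("queueName", "my-queue")       # hypothetical
              .option("connectionString", CONN_STR)  # from the secret above
              .schema(schema)                        # the StructType from the question
              .load())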
1 vote, 3 answers

Installing the janitor library in Azure Databricks

I have Python 3.7 installed and am trying to install the janitor library in Azure Databricks. It works properly on my local machine, but it is difficult to install in Azure Databricks. I ran dbutils.library.installPyPI('janitor') but got the below…
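A likely fix, sketched under the assumption that the failure is the package name: the pandas-cleaning library is published on PyPI as pyjanitor, while the import name is janitor, so installPyPI should target pyjanitor.

    # The PyPI name is "pyjanitor"; the import name is "janitor".
    dbutils.library.installPyPI("pyjanitor")
    dbutils.library.restartPython()

    import janitor  # noqa: F401 -- registers the DataFrame cleaning methods

    # On newer runtimes the notebook magic works too: %pip install pyjanitor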