Questions tagged [azure-databricks]

For questions about using the Databricks Lakehouse Platform on Microsoft Azure.

Overview

Azure Databricks is the Azure-based implementation of Databricks, a high-level platform for working with Apache Spark that includes Jupyter-style notebooks.

Azure Databricks is a first-class Azure service and integrates natively with other Azure services such as Active Directory, Blob Storage, Cosmos DB, Data Lake Store, Event Hubs, HDInsight, Key Vault, and Synapse Analytics.

4095 questions

1 vote, 1 answer

Does CLONE TABLE in Databricks delete the target table before cloning?

The documentation does not make it clear and we can't adequately test this: Does CREATE OR REPLACE TABLE 'x' DEEP CLONE 'y' synchronize two pre-existing Delta tables or does it delete the target and recreate it from the source?
Jerry Nixon • 31,313 • 14 • 117 • 233

1 vote, 1 answer

Uploading data to Azure Event Hubs on a daily basis

I have a scheduled job that runs daily in an Azure Databricks notebook and writes its output to a Parquet file in Databricks. I am creating an Azure Event Hub to which the daily output of the Parquet table will be uploaded. My question is: let's say on day1…
Saswat Ray • 141 • 3 • 14

1 vote, 1 answer

I/O error while accessing file:/dbfs/my_hyper.hyper: SIGBUS

I'm trying to write Tableau's .hyper file to a directory in Databricks. However, it yields: The database "hyper.file:/dbfs/my_hyper.hyper" could not be created: I/O error while accessing file:/dbfs/my_hyper.hyper: SIGBUS. Why is this happening? I face…
The Singularity • 2,428 • 3 • 19 • 48

1 vote, 1 answer

Databricks - read table from Snowflake to Databricks

I've seen a few questions on Databricks to Snowflake but my question is how to get a table from Snowflake into Databricks. What I've done so far: Created a cluster and attached the cluster to my notebook (I'm using Python) # Use secrets DBUtil to…

1 vote, 1 answer

Tricky upsert in a Delta table using Spark

I have a case class as follows: case class UniqueException( ExceptionId:String, LastUpdateTime:Timestamp, IsDisplayed:Boolean, …
Ganesha • 145 • 1 • 10

1 vote, 1 answer

What are the events (e.g. MODEL_VERSION_CREATED) associated with MLflow Databricks CI/CD?

MLflow has multiple events to subscribe to, such as MODEL_VERSION_CREATED, which fires when a model version is created. What other events are available to subscribe to?
om1042 • 11 • 2

1 vote, 1 answer

XLRDError: Excel xlsx file; not supported Databricks

I'm using Azure Databricks and trying to read an Excel file. I have an encrypted file with the extension .xlsx.pgp. After decrypting the message, I get it as a byte array. So, here's the function I use to read this file as a pandas DataFrame: df =…
jukebox • 453 • 2 • 8 • 24

1 vote, 1 answer

How to extract Azure Application Insights events using PySpark?

I'm trying to capture Azure Application Insights events in a structured format using the code below in PySpark (Azure Databricks) - import requests import json appId = "..." appKey = "..." query = """traces | where timestamp > ago(1d) | order by…
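
One way to get from a query response to something structured is sketched below. It assumes the response JSON has the tables / columns / rows layout that the Application Insights REST query endpoint is generally documented to return; the function name `rows_to_records` and the sample payload are hypothetical and only illustrate the shape.

```python
from typing import Any


def rows_to_records(response: dict[str, Any], table_index: int = 0) -> list[dict[str, Any]]:
    """Convert one table of a query response into a list of {column: value} dicts.

    Assumes the layout {"tables": [{"columns": [{"name": ...}], "rows": [[...]]}]}.
    """
    table = response["tables"][table_index]
    # Column names come from the table's metadata, in row order.
    names = [column["name"] for column in table["columns"]]
    # Pair each row's values with the column names.
    return [dict(zip(names, row)) for row in table["rows"]]


# Hypothetical payload: the values are made up for illustration only.
sample_response = {
    "tables": [
        {
            "name": "PrimaryResult",
            "columns": [
                {"name": "timestamp", "type": "datetime"},
                {"name": "message", "type": "string"},
            ],
            "rows": [
                ["2021-01-01T00:00:00Z", "first trace"],
                ["2021-01-01T00:05:00Z", "second trace"],
            ],
        }
    ]
}

records = rows_to_records(sample_response)
```

In a notebook, a list of dicts like `records` could then be handed to something like `spark.createDataFrame`, ideally with an explicit schema.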

1 vote, 1 answer

How to read Azure Databricks output using an API or class library

I have an Azure Databricks notebook that contains a SQL command. I need to capture the output of the SQL command and use it in .NET Core. Need help.

1 vote, 1 answer

Microsoft-Graph: Failing to get token from python code: Error SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]

I need to call a web API. For that I need a bearer token. I am using Databricks (Python) code to first authenticate against Microsoft AAD and then get a bearer token for my service_user. I followed the Microsoft docs but am facing a problem where it…

1 vote, 0 answers

Hide Azure Databricks logs

I am running code locally on my computer, which uses an Azure Databricks cluster. Because of this, I am getting a lot of "View job details at https://adb......" statements. I am logging other things, so I cannot turn off logging itself. How can I remove…

1 vote, 2 answers

How do we access a file in a GitHub repo inside our Azure Databricks notebook?

We have a requirement where we need to access a file hosted on our private GitHub repo in our Azure Databricks notebook. Currently we are doing it with a curl command using the Personal Access Token of a user. curl -H 'Authorization: token…
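
For reference, the curl approach described here could be expressed in Python roughly as follows. This is a minimal sketch using only the standard library and the GitHub REST contents endpoint; the function names, the `ref="main"` default, and the owner/repo/path values in the usage comment are hypothetical, and the token is assumed to come from a secret store rather than being hard-coded.

```python
import urllib.parse
import urllib.request


def build_github_file_request(owner: str, repo: str, path: str, token: str,
                              ref: str = "main") -> urllib.request.Request:
    """Build an authenticated request for the raw contents of one file."""
    quoted_path = urllib.parse.quote(path)        # keeps "/" between path segments
    quoted_ref = urllib.parse.quote(ref, safe="")
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/contents/{quoted_path}?ref={quoted_ref}")
    headers = {
        # Same scheme as the curl command: a personal access token.
        "Authorization": f"token {token}",
        # Ask for the raw file body instead of the JSON metadata wrapper.
        "Accept": "application/vnd.github.raw+json",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_github_file(owner: str, repo: str, path: str, token: str,
                      ref: str = "main") -> bytes:
    """Download the file and return its bytes (performs a network call)."""
    request = build_github_file_request(owner, repo, path, token, ref)
    with urllib.request.urlopen(request) as response:
        return response.read()


# Hypothetical usage in a notebook; the scope and key names are placeholders:
# token = dbutils.secrets.get(scope="my-scope", key="github-pat")
# content = fetch_github_file("my-org", "my-repo", "conf/settings.json", token)
```

Keeping the request construction separate from the download makes it possible to inspect the URL and headers without touching the network.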

1 vote, 1 answer

Event Hub: org.apache.spark.sql.AnalysisException: Required attribute 'body' not found

I am trying to write change data capture into Event Hubs as: df = spark.readStream.format("delta") \ .option("readChangeFeed", "true") \ .option("startingVersion", 0) \ .table("cdc_test1") While writing to Azure Event Hubs, it expects the content…
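
The error in the title suggests the sink is looking for a single column named body. As a rough illustration of that shape, the pure-Python sketch below packs one change row into a JSON string under a `body` key; the helper name `to_event_hub_row` and the `id` column are hypothetical. On a Spark DataFrame, a comparable step is often written as `df.select(to_json(struct("*")).alias("body"))`, shown here only as a comment.

```python
import json
from typing import Any


def to_event_hub_row(change_row: dict[str, Any]) -> dict[str, str]:
    """Serialize every field of a change row into one JSON string called 'body'."""
    # default=str covers values such as timestamps that json cannot encode natively.
    return {"body": json.dumps(change_row, default=str, sort_keys=True)}


# Hypothetical change-feed row; "id" is a made-up data column.
example = to_event_hub_row({"id": 1, "_change_type": "insert", "_commit_version": 0})

# Comparable PySpark step (not executed here):
# from pyspark.sql.functions import to_json, struct
# out_df = df.select(to_json(struct("*")).alias("body"))
```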

1 vote, 0 answers

How to pass pipeline data to an Azure ML pipeline Databricks step?

I have created an Azure ML pipeline consisting of 4 steps. The first two steps are Python script steps, the 3rd is a Databricks step, and the 4th is also a Python script step. I am creating pipeline data and passing it to all subsequent steps. …

1 vote, 1 answer

Writing from a pandas DataFrame to a Databricks database table

I have a database table in Azure Databricks that already has data in it - I need to append data to that table. I have my pandas DataFrame (df_allfeatures) that I want to append to my database. The function that I use to write to my database…
kdp132 • 11 • 1 • 4