Questions tagged [aws-databricks]

For questions about using the Databricks Lakehouse Platform on AWS.

Databricks Lakehouse Platform on AWS

Lakehouse Platform for accelerating innovation across data science, data engineering, business analytics, and data warehousing integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
0 votes
0 answers

Unable to access external location in Databricks on AWS

I'm unable to access one particular external location in Databricks on AWS using the Databricks CLI. The command databricks unity-catalog external-locations get --name returns Error: Authorization failed. Your token may be expired or lack the valid…
0 votes
1 answer

How to insert values into a table from a list

I have a list and a table like the ones below. I need to go through the values in the item name column of the table and find any item name that is in the list but missing from the table. Then I need to insert that missing…
MMV
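
The lookup described in this question can be sketched in plain Python. This is only a minimal illustration under assumptions: the question's actual list and table are not shown in the excerpt, so `item_list`, `table_rows`, the `item_name` key, and the helper `find_missing_items` are all hypothetical stand-ins.

```python
def find_missing_items(candidates, existing_names):
    """Return the candidates absent from existing_names, in order, without duplicates."""
    seen = set(existing_names)
    missing = []
    for name in candidates:
        if name not in seen:
            missing.append(name)
            seen.add(name)  # avoid inserting the same name twice
    return missing


# Hypothetical sample data standing in for the list and the table.
item_list = ["apple", "banana", "cherry", "banana"]
table_rows = [{"item_name": "apple"}, {"item_name": "cherry"}]

# Find names in the list that the table lacks, then append them as new rows.
missing = find_missing_items(item_list, [row["item_name"] for row in table_rows])
table_rows.extend({"item_name": name} for name in missing)
```

On a Spark DataFrame the same idea is usually expressed as a left anti join of the list against the table, followed by an append; that variant is not shown here because the table schema is not visible in the excerpt.
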
0 votes
2 answers

How to get the nearest timestamp to a certain time?

I need to pick data at an hourly frequency. Since a timestamp is sometimes not available at the exact hour, I need to pick the row whose timestamp is nearest to that hour. This is the dataframe I have: | job_id| timestamp …
MMV
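
One way to read the requirement is: round each timestamp to its nearest full hour, then keep, per hour, the timestamp with the smallest distance to it. The sketch below does this in plain Python; the function name `nearest_to_hour` and the sample values are hypothetical, and the `job_id` column from the question is ignored because the excerpt does not show how it should be grouped.

```python
from datetime import datetime, timedelta


def nearest_to_hour(timestamps):
    """Map each full hour to the timestamp closest to it."""
    best = {}
    for ts in timestamps:
        floor = ts.replace(minute=0, second=0, microsecond=0)
        # Timestamps at or past the half hour belong to the next full hour.
        if ts - floor >= timedelta(minutes=30):
            target = floor + timedelta(hours=1)
        else:
            target = floor
        current = best.get(target)
        if current is None or abs(ts - target) < abs(current - target):
            best[target] = ts
    return best


# Hypothetical sample values.
samples = [
    datetime(2023, 1, 1, 9, 58),
    datetime(2023, 1, 1, 10, 3),
    datetime(2023, 1, 1, 10, 40),
]
result = nearest_to_hour(samples)
```

With these samples, 09:58 is chosen for the 10:00 mark because it is two minutes away and 10:03 is three minutes away, and 10:40 is chosen for 11:00. An hour with no timestamp within 30 minutes simply gets no entry.
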
0 votes
0 answers

Is there any restriction on the number of days server logs are stored when logging is enabled on an S3 bucket?

I have a bucket with server logging enabled. The server logs are stored in a destination path in another S3 bucket (not necessarily inside the same bucket). I have been analysing them for more than a month, and I observed that server…
0 votes
1 answer

Can we execute a Databricks notebook from Informatica?

There is an Informatica workflow which consists of multiple source systems. We are trying to migrate one of the sources to Databricks and would like to execute the Databricks job from Informatica. Something similar to what we can do using Airflow but…
0 votes
0 answers

Unable to type in dropdown of dbutils.widgets.dropdown()

In AWS Databricks, I'm unable to type input into the dropdown box created with widgets.dropdown. I can't find the reason for this behaviour.
0 votes
0 answers

Getting a MALFORMED_REQUEST error when setting up IAM for Databricks on AWS using Terraform

Error: cannot create mws credentials: MALFORMED_REQUEST: Failed credential validation checks: please use a valid cross account IAM role with permissions setup correctly on cross-account-role.tf line 33, in resource "databricks_mws_credentials"…
0 votes
1 answer

Difference between group and users in Databricks

In Databricks, while creating a scope, we can either give user-level permission or add a group. What's the difference between the two? Can someone from outside the workspace be included in a group?
0 votes
0 answers

Get percentile of a column based on the percentile table I provide

I am using PySpark on Databricks. I have already built the percentile table based on the training sample. Now I want to use that table to get the percentile of the testing dataset. For example, I have column "Val1" and I have created a percentile…
0 votes
0 answers

How to perform AWS Databricks workspace migration and data backup?

I have been tasked with a Databricks workspace migration and also with backing up tables on that backup workspace. I explored this solution https://github.com/databrickslabs/migrate but it does not involve backing up tables with their data. Has anyone…
Shaggy
0 votes
1 answer

How can I connect to JDBC as a streaming source in Databricks?

Using the example from https://github.com/sutugin/spark-streaming-jdbc-source I've attempted to connect to a Postgres database as a streaming source in AWS Databricks. I have a cluster running: 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12) This…
0 votes
0 answers

Duplicate data from streams on merge in Delta tables

I have a source table with, say, the following data:
+----------------+---+--------+-----------------+---------+
|registrationDate| id|custName|            email|eventName|
+----------------+---+--------+-----------------+---------+
|      17-02-2023|  2|…
0 votes
0 answers

Create a Dataflow within the PBI reporting service that connects to a view/table built in Azure Databricks. Is it possible?

I need to create a Dataflow within the PBI reporting service that connects to a view/table built in Azure Databricks. Is it possible? Can I create a data model or join tables with Power BI Service? If yes, under which license, Pro or Premium?
Neh
0 votes
1 answer

Delta Lake Data Load Datatype mismatch

I am loading data from SQL Server to Delta Lake tables. Recently I had to repoint the source to another table (same columns), but the data type is different in the new table. This is causing an error while loading data to the Delta table. Getting following…
Vaishak
0 votes
0 answers

Is it possible to run a spark-submit task on Databricks using the archives parameter?

My problem is the following: I'm trying to run a job with a spark-submit task, but I have an environment to build. However, the archives parameter does not install my environment inside the cluster at runtime. Example job: { "name":"my_test" ... …