Questions tagged [aws-databricks]

For questions about the usage of Databricks Lakehouse Platform on AWS cloud.

Databricks Lakehouse Platform on AWS

Lakehouse Platform for accelerating innovation across data science, data engineering, business analytics, and data warehousing, integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
0
votes
2 answers

Databricks AWS account setup - AWS storage with error - Missing permissions: PUT, LIST, DELETE

I have created a PREMIUM trial Databricks account with AWS and set up an AWS account with user access keys. To configure AWS storage, I followed the instructions at the URL below to set up the bucket policy: { "Version":…
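The PUT/LIST/DELETE error in the title above usually means the bucket policy is missing the corresponding S3 actions. A minimal sketch of what such a policy can look like — the Databricks account ID and bucket name are placeholders, not real values, and the exact statement Databricks expects is given in its setup docs:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GrantDatabricksAccess",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::<DATABRICKS-ACCOUNT-ID>:root" },
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<BUCKET-NAME>",
        "arn:aws:s3:::<BUCKET-NAME>/*"
      ]
    }
  ]
}
```

Note that the object-level actions (`PutObject`, `DeleteObject`, …) apply to the `/*` resource, while `ListBucket` and `GetBucketLocation` apply to the bucket ARN itself; omitting either resource form produces exactly this class of missing-permission error.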
0
votes
2 answers

Using DataBricks API 2.0 with Tokens

I'm trying to hit the Databricks API 2.0 using Bearer tokens; I'm getting a 200 response but the results are not showing. I'm running this command: curl -H @{'Authorization' = 'Bearer '} https://<Databricks instance>/api/2.0/clusters/list
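The `@{'Authorization' = 'Bearer '}` in the command above is PowerShell hashtable syntax, which curl receives as a literal argument rather than a header; curl expects `-H "Authorization: Bearer <token>"` as a single string. A minimal Python sketch of the same call — the workspace URL and token are placeholders:

```python
# Build the Authorization header for the Databricks REST API 2.0.
# Both values below are hypothetical; substitute your own workspace and token.
workspace_url = "https://dbc-example.cloud.databricks.com"
token = "dapi0123456789abcdef"  # a Databricks personal access token

# The header must be a single "Authorization: Bearer <token>" string.
headers = {"Authorization": f"Bearer {token}"}
endpoint = f"{workspace_url}/api/2.0/clusters/list"

# With the `requests` library installed, the call would be:
# import requests
# resp = requests.get(endpoint, headers=headers)
# print(resp.json())
print(headers["Authorization"])
print(endpoint)
```

The equivalent fixed curl command is `curl -H "Authorization: Bearer <token>" https://<instance>/api/2.0/clusters/list`.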
0
votes
1 answer

In a Scala notebook on Databricks (Apache Spark), how do you correctly cast an array to type decimal(30,0)?

I am trying to cast an array as Decimal(30,0) for dynamic use in a select, as: WHERE array_contains(myArrayUDF(), someTable.someColumn) However, when casting with: val arrIds = someData.select("id").withColumn("id", col("id") …
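The question is in Scala, where the cast would be `col("id").cast(DecimalType(30, 0))`; a hedged PySpark sketch of the same idea, with the Spark calls shown as comments since they need a live SparkSession (the DataFrame and column names are illustrative):

```python
# PySpark equivalent of the Scala cast in the excerpt. Assumes an active
# SparkSession `spark` and a DataFrame `some_data` with an `id` column:
# from pyspark.sql.functions import col
# from pyspark.sql.types import DecimalType
# arr_ids = some_data.select(col("id").cast(DecimalType(30, 0)))

# In Spark SQL DDL form, the same target type is written as a string,
# which is what a dynamically built expression would use:
target_type = "decimal(30,0)"
cast_expr = f"CAST(id AS {target_type})"
print(cast_expr)
```

For an array column, the DDL form of the element-wise cast target is `array<decimal(30,0)>`; casting the bare column to `decimal(30,0)` when it actually holds an array is a common source of this error.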
0
votes
2 answers

AWS Glue: Deploy model in AWS environment

In our AWS environment, we have two different types of SAGs (Service Account Groups) for data storage. One SAG is for generic storage; the other SAG is for secure data and will only hold PII or restricted data. In our environment, we are planning to…
0
votes
1 answer

How to access a key value from AWS Key Management Service in Databricks

I am creating a solution in AWS Databricks and want to access the RDS user ID and password from AWS KMS. If anyone has implemented this scenario, please help.
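A common Databricks pattern for this is reading credentials from a secret scope via `dbutils.secrets.get` rather than decrypting from KMS directly. A hedged sketch — the scope name, key names, and RDS host below are all hypothetical, and the `dbutils` calls are commented out because they only run inside a Databricks notebook:

```python
# In a Databricks notebook, credentials usually come from a secret scope:
# username = dbutils.secrets.get(scope="rds-creds", key="username")
# password = dbutils.secrets.get(scope="rds-creds", key="password")
username, password = "db_user", "example-password"  # stand-ins for the secrets

# Build the JDBC connection details for the (hypothetical) RDS instance:
jdbc_url = "jdbc:mysql://my-rds-host.example.com:3306/mydb"
connection_props = {"user": username, "password": password}
print(jdbc_url)
```

On AWS, a Databricks secret scope can itself be backed by encrypted storage, so the secrets never appear in notebook source or job configs.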
Gaurav Gangwar
0
votes
1 answer

Get classname of the running Databricks Job

There is an Apache Spark Scala project (runnerProject) that uses another one in the same package (sourceProject). The aim of the source project is to get the name and version of the Databricks job that is running. The problem with the following…
Eve
0
votes
1 answer

Too many files on my Databricks Community cluster, but where?

I started playing with streaming on my Databricks Community Edition cluster, but after some minutes of producing test events I encountered a problem. I believe it's somehow connected to temporary small files produced during streaming…
-1
votes
2 answers

Use of '\' when reading a DataFrame

# File location and type
file_location = "/FileStore/tables/FileName.csv"
file_type = "csv"

# CSV options
infer_schema = "true"
first_row_is_header = "true"
delimiter = ","

# The applied options are for CSV files. For other file types, these will…
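The excerpt above is the standard Databricks CSV-read template. A sketch collecting those settings into a dict — the Spark read itself is commented out since it needs a live SparkSession, and the parenthesized chaining avoids the backslash line continuations the title asks about:

```python
# The excerpt's infer_schema / first_row_is_header / delimiter variables map
# to the Spark CSV option names inferSchema / header / sep.
file_location = "/FileStore/tables/FileName.csv"
csv_options = {"inferSchema": "true", "header": "true", "sep": ","}

# With a SparkSession `spark`, wrapping the chain in parentheses gives
# implicit line continuation, so no trailing '\' is needed:
# df = (spark.read.format("csv")
#       .options(**csv_options)
#       .load(file_location))
print(csv_options)
```

A trailing `\` also works in Python but breaks if any whitespace follows it, which is why the parenthesized style is usually preferred.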
-2
votes
1 answer

Using Spark Connector for Databricks and Snowflake on AWS

I'm looking at using both Databricks and Snowflake, connected by the Spark Connector, all running on AWS. I'm struggling to understand the following before making a decision: How well does the Spark Connector perform? (performance, extra costs,…