Questions tagged [aws-databricks]

For questions about the usage of Databricks Lakehouse Platform on AWS cloud.

Databricks Lakehouse Platform on AWS

The Databricks Lakehouse Platform accelerates innovation across data science, data engineering, business analytics, and data warehousing, integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
0
votes
1 answer

Databricks - Cannot create table: the associated location is not empty and also not a Delta table

I am getting the error: Cannot create table ('hive_metastore.MY_SCHEMA.MY_TABLE'). The associated location ('dbfs:/user/hive/warehouse/my_schema.db/my_table') is not empty and also not a Delta table. I tried to overcome this by running drop table…
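This error usually means the metastore entry is gone (or stale) but leftover non-Delta files remain at the table's location. A hedged sketch of the usual cleanup, with the table and path taken from the error message above (adjust to your own):

```sql
-- Drop any stale metastore entry first
DROP TABLE IF EXISTS hive_metastore.my_schema.my_table;

-- From a Python notebook cell, clear the leftover files at the location:
-- dbutils.fs.rm("dbfs:/user/hive/warehouse/my_schema.db/my_table", recurse=True)

-- The CREATE TABLE should then succeed as a fresh Delta table
CREATE TABLE hive_metastore.my_schema.my_table (id BIGINT) USING DELTA;
```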
0
votes
0 answers

Spark Ganglia report not matching Databricks cluster specifications

I have a Databricks cluster on AWS with a minimum of two nodes and a maximum of eight. Here's a picture of my cluster. I have cached a DataFrame, and under the Spark UI Storage tab I see it is 6.7 GB. So I would expect that if I go to Ganglia's UI, I would see that…
0
votes
0 answers

Unable to convert data to microseconds in Databricks SQL

I have a requirement to convert a string to a timestamp with microsecond precision. However, I can currently only convert the data up to millisecond precision. %sql select…
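Spark SQL timestamps carry microsecond precision; what often limits the result is the fraction pattern in the format string. A minimal sketch, assuming an input shaped like the hypothetical literal below (six `S`s keep microseconds; three would stop at milliseconds):

```sql
SELECT to_timestamp('2023-06-01 12:34:56.123456',
                    'yyyy-MM-dd HH:mm:ss.SSSSSS')   AS ts_micro,
       date_format(to_timestamp('2023-06-01 12:34:56.123456',
                                'yyyy-MM-dd HH:mm:ss.SSSSSS'),
                   'yyyy-MM-dd HH:mm:ss.SSSSSS')    AS ts_string;
```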
0
votes
1 answer

Databricks change default catalog

It seems that when I am connecting to the Databricks warehouse, it uses the default catalog, which is hive_metastore. Is there a way to make Unity Catalog the default? I know I can run the query USE CATALOG main, and then the current session…
Gilo
  • 640
  • 3
  • 23
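Besides `USE CATALOG` per session, the default can reportedly be set at the cluster (or SQL warehouse) level through Spark configuration; a sketch, assuming the `spark.databricks.sql.initial.catalog.name` setting available in recent runtimes:

```
spark.databricks.sql.initial.catalog.name main
```

Set in the cluster's Spark config, every new session then starts in the `main` catalog instead of `hive_metastore`.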
0
votes
1 answer

Convert Databricks notebook to .py file in workspace

The actual problem I'm trying to solve is that I'm using mkdocs/mkdocs-material for my documentation, but that tool can't work with notebook-type files. So the clumsy workaround I'm considering is an intermediate step that creates a copy of…
Error_2646
  • 2,555
  • 1
  • 10
  • 22
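One route is the workspace export REST endpoint, which returns a notebook as base64-encoded source when asked for `SOURCE` format. A hedged sketch that only builds and decodes the call; host, token, and notebook path are placeholders:

```python
import base64
import json
import urllib.parse
import urllib.request


def build_export_request(host: str, token: str, notebook_path: str):
    """Build the REST call that exports a notebook as plain source code
    (the workspace/export endpoint returns the file base64-encoded)."""
    query = urllib.parse.urlencode({"path": notebook_path, "format": "SOURCE"})
    return urllib.request.Request(
        f"{host}/api/2.0/workspace/export?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


def decode_export(response_body: bytes) -> str:
    """Decode the base64 'content' field of the export response to .py text."""
    return base64.b64decode(json.loads(response_body)["content"]).decode()


# Usage (placeholders; the urlopen line would do the actual export):
req = build_export_request("https://example.cloud.databricks.com",
                           "MY_TOKEN", "/Users/me/my_notebook")
# source = decode_export(urllib.request.urlopen(req).read())
```

The decoded text can then be written to a `.py` file that mkdocs can pick up.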
0
votes
2 answers

Keep partition count reasonable, but partition dataframe such that values of a high-cardinality column are in the same partition

Tagging "sql" too because an answer that derives a column to partition on with Spark SQL would be fine. Summary: say I have 3B distinct values of AlmostUID. I don't want 3B partitions; say I want 1000 partitions. But I want all like values of…
Error_2646
  • 2,555
  • 1
  • 10
  • 22
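In Spark, `df.repartition(1000, "AlmostUID")` hash-partitions on the column, so equal values always land in the same one of the 1000 partitions. A pure-Python sketch of that property (the column name is from the question; Python's `hash` stands in for Spark's internal hash):

```python
NUM_PARTITIONS = 1000


def partition_of(almost_uid) -> int:
    """Hash-based assignment: equal values always map to the same
    partition, and at most NUM_PARTITIONS partitions exist."""
    return hash(almost_uid) % NUM_PARTITIONS


# All occurrences of the same value land in the same bucket:
values = ["uid-1", "uid-2", "uid-1", "uid-3", "uid-2", "uid-1"]
buckets = {}
for v in values:
    buckets.setdefault(partition_of(v), []).append(v)
```

A Spark SQL variant of the same idea would derive `pmod(hash(AlmostUID), 1000)` as a column and `DISTRIBUTE BY` it.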
0
votes
1 answer

Unable to insert data into Postgres using JDBC

I am attempting to insert data into a PostgreSQL database using PySpark with JDBC. However, during the insert, Spark unexpectedly attempts to recreate the table and produces the following output. org.postgresql.util.PSQLException:…
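The recreate attempt is usually the save mode: the default (`errorifexists`) makes Spark try to CREATE the target, and `overwrite` drops and recreates it, while `append` inserts into the existing table. A hedged Python sketch (host, table, and credentials are placeholders):

```python
def jdbc_append_args(host, port, db, table, user, password):
    """Arguments for DataFrame.write.jdbc so rows are appended to an
    existing Postgres table instead of Spark trying to (re)create it."""
    url = f"jdbc:postgresql://{host}:{port}/{db}"
    props = {"user": user, "password": password,
             "driver": "org.postgresql.Driver"}
    # mode="append" inserts into the existing table; the default mode
    # fails or issues CREATE TABLE when the table does not match.
    return dict(url=url, table=table, mode="append", properties=props)


# Usage on Databricks (df is a hypothetical DataFrame):
# df.write.jdbc(**jdbc_append_args("myhost", 5432, "mydb",
#                                  "public.events", "user", "secret"))
```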
0
votes
1 answer

Databricks model deployment to AWS SageMaker -- No module named docker error

I am trying to deploy a dummy model to AWS SageMaker using Databricks and MLflow. According to this documentation, it builds a new MLflow SageMaker image, assigns it a name, and pushes it to ECR. However, when I run the following lines of code in a…
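Assuming the missing module is the PyPI `docker` package (which MLflow's image-build path imports to drive the Docker daemon), a notebook-cell sketch:

```
%pip install docker
```

Note that building the image also needs a reachable Docker daemon, which a Databricks driver typically does not have; running the deployment step from a local machine is a common workaround.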
0
votes
0 answers

Unable to read Hudi file in Spark Databricks Environment

I am facing this error while running Spark in Databricks, trying to read the Hudi file format. I'm using Hudi 0.13.0 with Databricks 12.2 LTS (Apache Spark 3.3.2, Scala 2.12). Trying to load a Hudi dataset from S3, but it failed with this…
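Hudi generally needs Kryo serialization and its Spark session extension registered before the session starts; without them, reads can fail in odd ways. A cluster Spark-config sketch, assuming the Hudi Spark 3.3 bundle matching DBR 12.2's Spark 3.3.2 is attached as a cluster library:

```
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.sql.extensions org.apache.spark.sql.hudi.HoodieSparkSessionExtension
```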
0
votes
0 answers

RuntimeError: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation - Databricks error

I created the following model: class EquipmentEmbeddingEndpoint(mlflow.pyfunc.PythonModel): def load_context(self, context): self.identifiers_df = get_identifier_information() def predict(self, context, model_input): …
nikhil
  • 1
  • 1
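That runtime error typically means `get_identifier_information()` touches Spark inside the model, and the model object is later deserialized on executors where no SparkContext exists. The usual fix is to materialize the lookup data into a plain, picklable structure before the model is logged. A hedged sketch (class name mirrors the question; the lookup logic is illustrative, and the mlflow.pyfunc base class is omitted so this runs standalone):

```python
# Illustrative stand-in for an mlflow.pyfunc.PythonModel: the point is
# that the model holds plain Python data, not anything tied to Spark.
class EquipmentEmbeddingEndpoint:
    def __init__(self, identifiers: dict):
        # Materialized driver-side BEFORE logging the model, e.g. via
        #   get_identifier_information().toPandas()
        # so only this plain dict is pickled with the model.
        self.identifiers = identifiers

    def predict(self, model_input):
        # Pure-Python lookup; safe to run on executors / serving nodes.
        return [self.identifiers.get(x) for x in model_input]


model = EquipmentEmbeddingEndpoint({"eq-1": "pump", "eq-2": "valve"})
```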
0
votes
0 answers

Replicating Table "Promotion" in Databricks w/ S3 Backend

In my experience with DBMS systems, one safe approach to promoting new datasets for business intelligence is to: apply updates to a staging table table_stg; validate the staging table updates against the production table table_prod; if they pass, rename…
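With Delta tables in Databricks, the rename dance can be replaced by a single metastore-level swap; a hedged sketch using `CREATE OR REPLACE ... DEEP CLONE` (table names from the question):

```sql
-- After table_stg validates, publish it in one atomic replace:
CREATE OR REPLACE TABLE table_prod DEEP CLONE table_stg;
```

Readers of table_prod see either the old or the new version, never a partial state, and the staging table is left intact for the next cycle.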
0
votes
0 answers

AWS Databricks cluster not starting: Failed to init

Bootstrap Timeout: [id: InstanceId(i-04bd85c1b17328b96), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-2273350004125125-ce5d7a3b-cda1-4494-aa7f-c1bcea16ce04), lastStatusChangeTime: 1686032804843, groupIdOpt Some(0),requestIdOpt…
0
votes
0 answers

Dropping columns from a nested array with root level array in PySpark - Databricks

How can I drop columns from a nested array in a PySpark dataframe that has an array at the root level in Databricks? I was able to drop columns from an array within a struct, but I can't find a way to do it within a nested array.
ic2019
  • 1
  • 1
  • 2
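In Spark 3.1+, the combination that usually works is `F.transform` over the array with `Column.dropFields` on each struct element, e.g. `df.withColumn("readings", F.transform("readings", lambda x: x.dropFields("debug")))` (column names hypothetical). A pure-Python analog of what that transform does, runnable without Spark:

```python
def drop_nested_field(rows, array_col, field):
    """For each row, rebuild array_col with `field` removed from every
    struct-like element (mirrors transform + dropFields in PySpark)."""
    return [
        {**row, array_col: [{k: v for k, v in elem.items() if k != field}
                            for elem in row[array_col]]}
        for row in rows
    ]


rows = [{"id": 1, "readings": [{"ts": 1, "val": 10, "debug": "x"},
                               {"ts": 2, "val": 20, "debug": "y"}]}]
cleaned = drop_nested_field(rows, "readings", "debug")
```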
0
votes
0 answers

Column contains a special character; how to create a view

I have a table with a special character in a column name. How do I create a view? create or replace view view1 as select phone# from phonebook. This CREATE statement does not work in AWS Databricks.
Salman
  • 3
  • 2
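In Databricks SQL, identifiers with special characters need backtick quoting; a sketch against the statement from the question:

```sql
CREATE OR REPLACE VIEW view1 AS
SELECT `phone#` AS phone FROM phonebook;
```

Aliasing the column also keeps the special character from propagating into the view's schema.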
0
votes
0 answers

How to get AWS Databricks cluster health metrics using the API

I want to get some details (like memory usage and CPU) of a Databricks cluster using an API. We have the Ganglia UI, but I need to use an API to get some custom metrics.
PB22
  • 31
  • 4
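The REST Clusters API (`GET /api/2.0/clusters/get`) returns cluster state, node counts, and configuration, but not the live memory/CPU figures Ganglia shows; for those, shipping a metrics exporter via an init script is the usual route. A hedged sketch that builds the status call (host, token, and cluster id are placeholders):

```python
import urllib.parse
import urllib.request


def build_cluster_status_request(host, token, cluster_id):
    """GET call for cluster state/size; live CPU/memory metrics are not
    part of this API's response."""
    query = urllib.parse.urlencode({"cluster_id": cluster_id})
    return urllib.request.Request(
        f"{host}/api/2.0/clusters/get?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


req = build_cluster_status_request("https://example.cloud.databricks.com",
                                   "MY_TOKEN", "0601-182128-dcbte59m")
# status = json.load(urllib.request.urlopen(req))  # e.g. status["state"]
```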