Questions tagged [aws-databricks]

For questions about using the Databricks Lakehouse Platform on the AWS cloud.

Databricks Lakehouse Platform on AWS

A lakehouse platform for accelerating innovation across data science, data engineering, business analytics, and data warehousing, integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
0 votes • 1 answer

Do we have any Spark libraries to connect from Databricks to OpenSearch?

While using the Elasticsearch library "org.elasticsearch:elasticsearch-spark-30_2.12:7.13.3", which works fine when the target is Elasticsearch 7.10, with OpenSearch 2.3 as the target it gives an issue like a mapping parser exception. Basically while…
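
A minimal write sketch, assuming the elasticsearch-spark connector from the question is attached to the cluster; host, credentials, and index name are placeholders. The OpenSearch project also maintains a forked connector (opensearch-hadoop) aimed at OpenSearch 2.x targets, which may be worth trying if the mapping parser exception persists.

```python
# Sketch: write a small DataFrame through the elasticsearch-spark connector.
# Host, port, credentials, and index name are placeholders.
df = spark.createDataFrame([(1, "alpha"), (2, "beta")], ["id", "label"])

(df.write
   .format("org.elasticsearch.spark.sql")
   .option("es.nodes", "my-opensearch-domain.example.com")  # placeholder endpoint
   .option("es.port", "443")
   .option("es.nodes.wan.only", "true")                     # common for managed endpoints
   .option("es.net.ssl", "true")
   .option("es.net.http.auth.user", "user")                 # placeholder credentials
   .option("es.net.http.auth.pass", "password")
   .mode("append")
   .save("demo-index"))
```
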
0 votes • 0 answers

Can I find the remaining 85-90 columns which were not provided to me, in Databricks?

I have a table in Databricks which has close to 100 columns. I was given some 10-15 columns; can I find the remaining 85-90 columns which were not provided to me? For example, table 'A' has columns named (a,b,c,d,e,f,g,h,....z), and I was given with…
Salman • 3 • 2
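
A minimal sketch for the question above: read the table's full column list from its schema and subtract the columns you were given. Table and column names are placeholders, with a toy temp view standing in for the real table.

```python
# Toy stand-in for table A; in practice spark.table("A") would read your real table.
spark.range(1).selectExpr("id as a", "id as b", "id as c", "id as d", "id as e") \
     .createOrReplaceTempView("A")

given_cols = {"a", "b", "c"}                 # the 10-15 columns you were given (placeholder)
all_cols = set(spark.table("A").columns)     # every column the table actually has
remaining = sorted(all_cols - given_cols)    # the columns not provided to you
print(remaining)                             # ['d', 'e']
```
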
0 votes • 1 answer

Configure Amazon maximum percentage of OnDemand price (spot instances)

I'm playing a little with spot instances, and for example, in Databricks, I can ask for a spot instance with a minimum of % savings over On-Demand instances. My question is, if I set 90% off the On-Demand instance and the current price is 50%, I…
Alejandro • 519 • 1 • 6 • 32
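
For reference, a sketch of where this knob sits when a cluster is created through the Databricks Clusters API: aws_attributes.spot_bid_price_percent caps the spot bid as a percentage of the on-demand price. The values below are illustrative, not a complete request.

```python
import json

# Illustrative fragment of a Clusters API create payload (not a complete request).
cluster_spec = {
    "cluster_name": "spot-test",                 # placeholder
    "spark_version": "11.3.x-scala2.12",         # placeholder runtime
    "node_type_id": "r5a.4xlarge",
    "num_workers": 2,
    "aws_attributes": {
        "availability": "SPOT_WITH_FALLBACK",    # fall back to on-demand if spot is unavailable
        "first_on_demand": 1,                    # keep the driver on an on-demand instance
        "spot_bid_price_percent": 100,           # bid at most 100% of the on-demand price
    },
}
print(json.dumps(cluster_spec, indent=2))
```
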
0 votes • 1 answer

Send emails from Azure Databricks

I would like to send emails from Azure Databricks. I tried to follow this: https://docs.databricks.com/_static/notebooks/kb/notebooks/send-email-aws.html But when I execute this: send_email(from_addr, to_addrs, subject, html,…
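
The linked notebook builds a MIME message and hands it to an SMTP relay; a compact, hedged version is sketched below with placeholder host and credentials (Amazon SES, SendGrid, or any authenticated SMTP server would slot in the same way).

```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def send_email(from_addr, to_addrs, subject, html,
               smtp_host="email-smtp.us-east-1.amazonaws.com",  # placeholder relay
               smtp_port=587, user="smtp-user", password="smtp-password"):
    """Minimal sketch: send an HTML email through an authenticated SMTP relay."""
    msg = MIMEMultipart("alternative")
    msg["From"], msg["To"], msg["Subject"] = from_addr, ", ".join(to_addrs), subject
    msg.attach(MIMEText(html, "html"))
    with smtplib.SMTP(smtp_host, smtp_port) as server:
        server.starttls()                        # most relays require TLS before login
        server.login(user, password)
        server.sendmail(from_addr, to_addrs, msg.as_string())

# send_email("me@example.com", ["you@example.com"], "Test", "<b>Hello from Databricks</b>")
```
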
0 votes • 1 answer

Previous month query - Databricks

I am trying to find a function where I can extract the results of the last month only (for example, if I launch the query in November, I want to display only the results of October). Here is the result: I don't know if I have to enter the function in my…
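
One way to express "previous calendar month" in Databricks SQL is to bound the date column by the first day of last month and the first day of the current month; a sketch with placeholder table and column names (and a toy temp view) follows.

```python
from datetime import date

# Toy stand-in for the real table; "my_table" and "event_date" are placeholders.
spark.createDataFrame([(date(2022, 10, 5), 1), (date(2022, 11, 2), 2)],
                      ["event_date", "value"]).createOrReplaceTempView("my_table")

# Rows from the previous calendar month only (run in November -> October rows).
last_month = spark.sql("""
    SELECT *
    FROM my_table
    WHERE event_date >= add_months(date_trunc('MONTH', current_date()), -1)
      AND event_date <  date_trunc('MONTH', current_date())
""")
last_month.show()
```
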
0 votes • 0 answers

Cook's distance in PySpark

I wanted to use Cook's distance to remove the outliers from my dataset for regression, but I am not able to find any method to do so in PySpark. I know how we can do it in Python using the get_influence() method. Is there any similar method in PySpark?
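
Spark ML has no built-in Cook's distance; a common workaround is to pull the data (or a sample) to the driver and reuse statsmodels' influence measures, as sketched below with toy data. For data that cannot fit on the driver, the leverage term would need a distributed formulation.

```python
import statsmodels.api as sm

# Toy data; in practice df would be your existing Spark DataFrame.
df = spark.createDataFrame(
    [(1.0, 1.0, 2.1), (2.0, 1.5, 3.9), (3.0, 2.0, 6.2), (4.0, 2.5, 7.8), (10.0, 9.0, 55.0)],
    ["x1", "x2", "y"])

pdf = df.toPandas()                              # assumes the data (or a sample) fits on the driver
ols = sm.OLS(pdf["y"], sm.add_constant(pdf[["x1", "x2"]])).fit()
cooks_d, _ = ols.get_influence().cooks_distance  # per-row Cook's distance

threshold = 4 / len(pdf)                         # a common rule-of-thumb cutoff
clean_df = spark.createDataFrame(pdf[cooks_d < threshold])  # back to Spark without the outliers
```
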
0 votes • 1 answer

Terraform + Databricks error ENDPOINT_NOT_FOUND: Unsupported path:

I am wondering if someone has already encountered this error, which I am getting when trying to create OBO tokens for Databricks service principals. When setting up the databricks_permissions I get: Error: ENDPOINT_NOT_FOUND: Unsupported path:…
0 votes • 0 answers

Spark Memory Management Calculation

I am new to Spark applications. I am using an r5a.4xlarge AWS cluster with a minimum of 1 worker and a maximum of 16 workers. This instance has 128 GB of memory and 16 cores. I have set spark.executor.cores to 5. As per the memory management calculation, memory/ executor…
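
A worked version of the usual back-of-the-envelope sizing for one r5a.4xlarge worker (16 vCPUs, 128 GB) with spark.executor.cores = 5, using the common ~10% memory-overhead heuristic; the exact amounts the Databricks runtime reserves for its own services will differ.

```python
# Heuristic executor sizing for one r5a.4xlarge worker; not the exact values
# the Databricks runtime reserves internally.
node_cores, node_memory_gb = 16, 128
cores_per_executor = 5                                     # spark.executor.cores

usable_cores = node_cores - 1                              # leave ~1 core for OS/daemons
executors_per_node = usable_cores // cores_per_executor    # -> 3

usable_memory_gb = node_memory_gb - 8                      # rough OS/services reserve
memory_per_executor_gb = usable_memory_gb / executors_per_node        # -> 40
overhead_gb = max(0.10 * memory_per_executor_gb, 0.384)    # spark.executor.memoryOverhead heuristic
executor_memory_gb = memory_per_executor_gb - overhead_gb  # -> ~36 for spark.executor.memory

print(executors_per_node, round(executor_memory_gb, 1))    # 3 36.0
```
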
0 votes • 0 answers

How to successfully execute a stored procedure in Databricks versions higher than 7.3

Databricks will soon be dropping support for their 7.3 LTS runtime. Unfortunately, not all the functionality (that we require) appears to be easy to replicate in later runtimes. The main sticking point that we've run across so far is forming SQL…
0 votes • 0 answers

java.lang.ClassNotFoundException: org.graphframes.GraphFramePythonAPI Error in Databricks

I am getting this error on the Community Edition of Databricks when trying to make a graph with the GraphFrame() function: java.lang.ClassNotFoundException: org.graphframes.GraphFramePythonAPI. I have tried a few…
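
That ClassNotFoundException usually means the GraphFrames JAR is not attached to the cluster; the Python package alone is not enough. Assuming the graphframes Maven library (coordinates matching your Spark/Scala version) is installed, a minimal GraphFrame like the one below should build.

```python
from graphframes import GraphFrame

# Minimal sketch; assumes the graphframes Maven library is installed on the cluster
# in addition to the Python package.
vertices = spark.createDataFrame([("a", "Alice"), ("b", "Bob")], ["id", "name"])
edges = spark.createDataFrame([("a", "b", "follows")], ["src", "dst", "relationship"])

g = GraphFrame(vertices, edges)
g.inDegrees.show()
```
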
0 votes • 1 answer

Error cluster launch: Security Daemon Registration

I have created a workspace in AWS Databricks with private link. When we launch a cluster we get the following error: Security Daemon Registration Exception: Failed to set up the spark container due to an error when registering the container to…
0 votes • 2 answers

Left Joining after Case Statement SQL

I have two tables, A and B. In table A there's one column with a full name called EmployeeName, and in table B there's also one column with the name OrigFullName. The thing is, the column EmployeeName doesn't follow a standard; sometimes there's…
sohrenan • 11 • 2
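
Without knowing the exact inconsistencies, a common pattern is to normalize both name columns inside the join condition (or in a CTE) and left join on the normalized value; the sketch below uses a hypothetical upper-case/trim normalization and toy temp views for A and B.

```python
# Toy stand-ins for tables A and B.
spark.createDataFrame([(" john smith ",)], ["EmployeeName"]).createOrReplaceTempView("A")
spark.createDataFrame([("JOHN SMITH",)], ["OrigFullName"]).createOrReplaceTempView("B")

# Hypothetical normalization: upper-case and trim both sides before the left join.
joined = spark.sql("""
    SELECT a.*, b.OrigFullName
    FROM A AS a
    LEFT JOIN B AS b
      ON upper(trim(a.EmployeeName)) = upper(trim(b.OrigFullName))
""")
joined.show()
```
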
0 votes • 0 answers

PySpark DataFrame - Discretize the selected numerical columns and then apply the groupby and crosstab functions

I have a dataframe which has 100+ numerical columns. I want to discretize some columns from it and then apply the groupby and crosstab functions on these discretized columns. Currently, I am using a loop to iterate over all selected numerical…
ASD • 25 • 6
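
A sketch that avoids the per-column loop by using QuantileDiscretizer's multi-column support (Spark 3.0+), then aggregating on the bucketed columns with groupBy and crosstab; column names and toy data are placeholders.

```python
from pyspark.ml.feature import QuantileDiscretizer

# Toy data; in practice df would be your 100+-column DataFrame.
num_cols = ["col1", "col2"]
df = spark.createDataFrame(
    [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0), (5.0, 50.0)], num_cols)

bucket_cols = [c + "_bucket" for c in num_cols]
discretizer = QuantileDiscretizer(numBuckets=3, inputCols=num_cols, outputCols=bucket_cols)
bucketed = discretizer.fit(df).transform(df)

bucketed.groupBy("col1_bucket").count().show()           # distribution of one bucketed column
bucketed.crosstab("col1_bucket", "col2_bucket").show()   # contingency table of two bucketed columns
```
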
0 votes • 0 answers

Databricks with CloudWatch metrics without the InstanceId dimension

I have jobs running on job clusters, and I want to send metrics to CloudWatch. I set up the CloudWatch agent following this guide, but the issue is that I can't create useful metrics dashboards and alarms because I always have the InstanceId dimension, and InstanceId is…
CoyoteKG • 45 • 5
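
One lever in the CloudWatch agent configuration is aggregation_dimensions, which makes the agent also publish rollups without the InstanceId dimension so dashboards and alarms can target the cluster-wide series; a sketch of the relevant metrics fragment is below, written as a Python dict for brevity (values are illustrative).

```python
import json

# Illustrative fragment of amazon-cloudwatch-agent.json: aggregation_dimensions
# asks the agent to also emit metrics rolled up without per-instance dimensions.
agent_config = {
    "metrics": {
        "namespace": "DatabricksJobs",                           # placeholder namespace
        "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
        "aggregation_dimensions": [[]],                          # [] = roll up across all dimensions
        "metrics_collected": {
            "mem": {"measurement": ["mem_used_percent"]},
            "disk": {"measurement": ["used_percent"], "resources": ["/"]},
        },
    }
}
print(json.dumps(agent_config, indent=2))
```
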
0 votes • 2 answers

Count function on Databricks provides different output every time I run the code

I am new to Databricks and working with PySpark dataframes. In my code, I have joined the two dataframes using the join function, and then I use the count function to get the count of the new dataframe. Then I sort the dataframe using the orderBy function and…
ASD • 25 • 6
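
If nothing upstream is non-deterministic (sampling, current timestamps, source files changing between runs), repeated counts of the same join should agree; a common first check is to cache the joined result so every action reads the same materialized data, sketched below with toy DataFrames.

```python
# Toy DataFrames; df1/df2 stand in for the two frames being joined.
df1 = spark.createDataFrame([(1, "a"), (2, "b"), (3, "c")], ["id", "v1"])
df2 = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "v2"])

joined = df1.join(df2, on="id", how="inner").cache()   # pin the join result in memory

first_count = joined.count()      # materializes and caches the join
second_count = joined.count()     # should match first_count on cached data
print(first_count, second_count)  # 2 2
```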