Questions tagged [aws-databricks]

For questions about the usage of Databricks Lakehouse Platform on AWS cloud.

Databricks Lakehouse Platform on AWS

The Databricks Lakehouse Platform accelerates innovation across data science, data engineering, business analytics, and data warehousing, integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
1 vote • 1 answer

How to import a text file in Databricks

I am trying to write a text file with some text and load the same text file in Databricks, but I am getting an error. Code: # write a file to DBFS using Python I/O APIs with open("/dbfs/FileStore/tables/test_dbfs.txt", 'w') as f: f.write("Apache Spark is…
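A minimal sketch for the question above, assuming a classic (non-serverless) Databricks cluster where DBFS is exposed on the driver through the /dbfs local mount; the file path and contents are illustrative:

    # Write a small text file through the /dbfs FUSE mount with plain Python I/O.
    path = "/dbfs/FileStore/tables/test_dbfs.txt"   # local-path view of dbfs:/FileStore/tables/...
    with open(path, "w") as f:
        f.write("Apache Spark is a unified analytics engine.\n")

    # Read it back as a Spark DataFrame using the dbfs:/ form of the same path.
    df = spark.read.text("dbfs:/FileStore/tables/test_dbfs.txt")
    df.show(truncate=False)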
1 vote • 1 answer

poetry publish from codebuild to aws codeartifact fails with UploadError

I have a dataset I need to periodically import into my data lake, replacing the current dataset. After I produce a DataFrame I currently do: df.write.format("delta").save("dbfs:/mnt/defaultDatalake/datasets/datasources") But if I run the job again I get…
alonisser • 11,542 • 21 • 85 • 139
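A minimal sketch for the question above, not the accepted answer: re-running save() against an existing Delta path fails unless an overwrite mode is set, so one common approach is to overwrite the dataset (and, if needed, its schema) in place.

    # Overwrite the existing Delta dataset instead of trying to re-create it.
    (df.write
       .format("delta")
       .mode("overwrite")                    # replace the current contents
       .option("overwriteSchema", "true")    # only needed if the schema has changed
       .save("dbfs:/mnt/defaultDatalake/datasets/datasources"))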
1 vote • 1 answer

How to access AWS public dataset using Databricks?

For one of my classes, I have to analyze a "big data" dataset. I found the following dataset on the AWS Registry of Open Data that seems interesting: https://registry.opendata.aws/openaq/ How exactly can I create a connection and load this dataset…
Aspire • 397 • 1 • 3 • 9
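A minimal sketch for the question above, assuming the dataset lives in a public S3 bucket that allows anonymous reads; the bucket, prefix, and file format below are illustrative, and the real values come from the dataset's page on the AWS Registry of Open Data.

    # Allow anonymous (unsigned) access to the public bucket for this Spark session.
    spark.conf.set(
        "fs.s3a.aws.credentials.provider",
        "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider",
    )

    # Load the files directly into a DataFrame; adjust the reader to the actual file format.
    df = spark.read.json("s3a://example-open-data-bucket/records/*.ndjson")
    df.printSchema()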
1 vote • 1 answer

Cannot import CSV file into h2o from Databricks cluster DBFS

I have successfully installed h2o on my AWS Databricks cluster and then successfully started the h2o server with: h2o.init() When I attempt to import the iris CSV file that is stored in my Databricks DBFS: train, valid =…
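A minimal sketch for the question above, assuming the h2o Python package is installed on the cluster: h2o does not understand the dbfs:/ scheme, so a common workaround is to point it at the /dbfs local mount (the file path is illustrative).

    import h2o

    h2o.init()
    iris = h2o.import_file("/dbfs/FileStore/tables/iris.csv")    # local-path view of the DBFS file
    train, valid = iris.split_frame(ratios=[0.8], seed=42)       # 80/20 split into two frames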
1 vote • 1 answer

How to access an AWS public dataset using Databricks?

I am new to Databricks. I am looking for a public big data dataset for my school project, and I came across an AWS public dataset at this link: https://registry.opendata.aws/target/ I am using Python on Databricks, and I don't know how to establish a…
kimhkh • 27 • 4
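Complementing the Spark sketch above, here is a minimal sketch for browsing what a public dataset's bucket actually contains, using boto3 (preinstalled on Databricks runtimes) with unsigned requests; the bucket name and prefix are placeholders.

    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # Anonymous S3 client: public open-data buckets usually allow unsigned reads.
    s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

    resp = s3.list_objects_v2(Bucket="example-open-data-bucket", Prefix="data/", MaxKeys=20)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])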
1 vote • 1 answer

How can I set spark.task.maxFailures on AWS Databricks?

I would like to set spark.task.maxFailures to a value greater than 4. Using the Databricks 6.4 runtime, how can I set this value? When I execute spark.conf.get("spark.task.maxFailures"), I get the error below java.util.NoSuchElementException:…
ravi malhotra • 703 • 5 • 14
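A minimal sketch for the question above, not Databricks documentation: spark.task.maxFailures is a SparkContext-level setting, so it is not visible through spark.conf unless it was set when the cluster started. One approach is to put it in the cluster's Spark config (Cluster > Advanced options) and then read it back from the SparkContext.

    # Cluster "Spark config" entry, applied before the cluster starts (value is illustrative):
    #   spark.task.maxFailures 10

    # Once the cluster is up, the value can be read from the SparkContext configuration:
    print(spark.sparkContext.getConf().get("spark.task.maxFailures", "not set"))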
0 votes • 1 answer

Databricks Job API via Python "Run settings must be specified"

In Databricks I've manually created a DAG job-of-jobs (task type Run job) that executes several sub-jobs. When I manually run it, it works well, and I can see it executing the sub-jobs to completion in the run. The issue is that I want to actually…
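A minimal sketch of triggering such a job programmatically via the Jobs API run-now endpoint with plain requests; the host, token, and job_id are placeholders, and one possible cause of the "Run settings must be specified" message is a request whose JSON body is missing the settings the endpoint expects (such as the job_id).

    import requests

    host = "https://<workspace>.cloud.databricks.com"   # placeholder workspace URL
    token = "<personal-access-token>"                    # placeholder token

    resp = requests.post(
        f"{host}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {token}"},
        json={"job_id": 12345},                          # placeholder job id
    )
    resp.raise_for_status()
    print(resp.json())   # contains the run_id of the triggered run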
0 votes • 1 answer

Error: Spark driver stopped unexpectedly due to memory

I have the code below, where I need to reuse the flag from the previous day, so I am running a loop. I can't use an offset here because only once I know the previous day's flag can I use it for today. So this loop runs 1000 times and after…
ASD • 25 • 6
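A minimal sketch of one common mitigation for the question above, not the asker's actual fix: a long loop of dependent DataFrame transformations grows the query plan until the driver runs out of memory, and periodically truncating the lineage (checkpointing, or a write/read round trip) keeps the plan small. The data and transformation below are stand-ins.

    # Checkpointing needs a checkpoint directory (path is illustrative).
    spark.sparkContext.setCheckpointDir("dbfs:/tmp/checkpoints")

    df = spark.range(0, 1000).withColumnRenamed("id", "day")      # stand-in for the real data
    for i in range(1000):
        df = df.withColumn("flag", (df["day"] % 2) == (i % 2))    # stand-in transformation
        if i % 50 == 0:
            df = df.checkpoint()                                  # cut the lineage every 50 iterations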
0 votes • 1 answer

Read Kafka store file location from S3

We are getting the error below, so we started fetching the Kafka key and certificate from an S3 location (s3://my-bucket/tmp/k2/truststore.jks) in a Databricks notebook. DbxDlTransferError: Terminated with exception: Kafka store file location only supports external…
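A minimal sketch of one possible workaround for the question above, under the assumption that the Kafka SSL options need a path the JVM can open locally rather than an s3:// URI, and that the cluster already has read access to the bucket: copy the truststore to DBFS and reference it through the /dbfs local mount (broker, topic, and paths are placeholders; password options are omitted).

    # Copy the truststore from S3 to DBFS once, then point the Kafka source at the local mount.
    dbutils.fs.cp("s3://my-bucket/tmp/k2/truststore.jks",
                  "dbfs:/FileStore/kafka/truststore.jks")

    df = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9093")            # placeholder broker
          .option("kafka.security.protocol", "SSL")
          .option("kafka.ssl.truststore.location",
                  "/dbfs/FileStore/kafka/truststore.jks")              # local-path view of the DBFS copy
          .option("subscribe", "my-topic")                             # placeholder topic
          .load())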
0 votes • 0 answers

How to transfer all the ML scripts, models etc. from Databricks to AWS SageMaker

My Databricks workspace is hosted on AWS, and I want to transfer all the notebooks, models, etc. from Databricks to SageMaker. Can anyone tell me the procedure to follow?
Harry1234 • 21 • 1
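A minimal sketch of one piece of such a migration, assuming REST access to the workspace: export a notebook in SOURCE format through the Workspace API so it can be stored elsewhere and re-imported. The host, token, and notebook path are placeholders; models would be handled separately (for example through the MLflow registry).

    import base64
    import requests

    host = "https://<workspace>.cloud.databricks.com"    # placeholder workspace URL
    token = "<personal-access-token>"                     # placeholder token

    resp = requests.get(
        f"{host}/api/2.0/workspace/export",
        headers={"Authorization": f"Bearer {token}"},
        params={"path": "/Users/me@example.com/my_notebook", "format": "SOURCE"},
    )
    resp.raise_for_status()
    source = base64.b64decode(resp.json()["content"]).decode("utf-8")
    print(source[:200])   # first few lines of the exported notebook source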
0 votes • 1 answer

Cannot create a Metastore in Databricks

I followed all the steps from https://www.youtube.com/watch?v=cylJ9hPmt7c but I am still getting an error; here is an example. I can't figure out why. I also tried different regions. I have account admin on the Databricks console and admin in AWS, so it's not the…
0 votes • 2 answers

How to control Databricks autoscaling from the driver node

I am using Databricks for a specific workload. This workload involves approximately 10 to 200 DataFrames that are read from and written to a storage location. This workload can benefit from parallelism. The constraint I have is cost optimization.…
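A minimal sketch of one common pattern for the question above, not a way to drive the autoscaler directly: submit the independent read/write jobs concurrently from the driver with a thread pool, so the cluster has enough pending tasks for autoscaling to react to. The paths and transformation are placeholders.

    from concurrent.futures import ThreadPoolExecutor

    paths = [f"dbfs:/mnt/source/table_{i}" for i in range(10)]    # placeholder inputs

    def copy_table(path: str) -> str:
        df = spark.read.format("delta").load(path)
        df.write.format("delta").mode("overwrite").save(path.replace("source", "target"))
        return path

    # Each thread submits its own Spark jobs; Spark schedules them across the cluster.
    with ThreadPoolExecutor(max_workers=8) as pool:
        for done in pool.map(copy_table, paths):
            print("finished", done)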
0 votes • 1 answer

BigQuery Databricks connectivity

How to read data from BigQuery into a DataFrame in Databricks using credentials stored in secrets. df = spark.read.format("bigquery").option("credentialsFile", credentialfilepath).option("parentProject", projectName)…
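A minimal sketch for the question above, assuming the BigQuery Spark connector available on Databricks and a service-account JSON key stored in a Databricks secret: the connector also accepts the key inline (base64-encoded) via the "credentials" option, which avoids distributing a credentials file. The secret scope/key, project, and table names are placeholders.

    import base64

    key_json = dbutils.secrets.get(scope="gcp", key="bq-service-account")   # placeholder scope/key
    key_b64 = base64.b64encode(key_json.encode("utf-8")).decode("utf-8")

    df = (spark.read
          .format("bigquery")
          .option("credentials", key_b64)
          .option("parentProject", "my-gcp-project")                # placeholder GCP project
          .option("table", "my-gcp-project.my_dataset.my_table")    # placeholder table
          .load())
    df.show(5)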
0 votes • 0 answers

Databricks to ElasticSearch error [scala/Product$class]

When we try the code below to push data from Databricks to Elasticsearch, we get the error below. Elasticsearch version 6.1.3 Databricks runtime version: Added the jars below to the cluster…
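A minimal sketch of a write with the elasticsearch-hadoop connector for context on the question above: the java.lang.NoClassDefFoundError: scala/Product$class error is typically a Scala version mismatch (for example, a _2.11 connector jar on a Scala 2.12 Databricks runtime), so the connector artifact has to match the cluster's Scala version. Host, port, and index are placeholders.

    (df.write
       .format("org.elasticsearch.spark.sql")
       .option("es.nodes", "my-es-host")          # placeholder Elasticsearch host
       .option("es.port", "9200")
       .option("es.resource", "my-index/_doc")    # placeholder index/type for ES 6.x
       .mode("append")
       .save())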
0 votes • 1 answer

Databricks AWS compute cluster location

I have hosted Databricks on top of AWS, but I cannot see any EC2 instances created for Databricks. Can anyone explain: if I create Databricks on AWS in my VPC, will the compute be created outside my AWS VPC? If so, where will the…