Questions tagged [aws-databricks]

For questions about using the Databricks Lakehouse Platform on AWS.

Databricks Lakehouse Platform on AWS

The Lakehouse Platform accelerates innovation across data science, data engineering, business analytics, and data warehousing, integrated with your AWS infrastructure.

Reference: https://databricks.com/aws

190 questions
0 votes, 2 answers

Unable to execute Databricks REST API for data copy using Python

When I execute the code below to copy data from Databricks to my local machine, it fails with an error. Can anyone please help me solve this error? import os from databricks_cli.sdk.api_client import ApiClient from…
0 votes, 1 answer

Databricks write config file to dbfs:/

Is it possible to write or modify (as with vim) a file at dbfs:/ using databricks-cli or the UI? I don't have the option of writing this file from the notebook directly. Example: I need to create an .ini file, put properties in it, and later read them from a notebook.
calm27 • 145 • 6
0 votes, 1 answer

Why does the last part of a run take so long in Databricks?

I am using Databricks to create an algorithm for big data. I am wondering why the last 1% of my run takes so long. I am writing the results to S3; the results for 111991 items (out of 116367) are done in 5 minutes, and just for the last…
0 votes, 1 answer

AWS subnets using Terraform

I have a set of subnets. How do I automatically assign the available subnets using Terraform? Example: ["subnet-a", "subnet-b", "subnet-c", "subnet-d"]. I want to pick the two available subnets from the given set for module A and module B.
0 votes, 1 answer

How to pass values to a command in Linux when it prompts for yes and other parameters

I am trying to connect to a Databricks workspace with the databricks-connect command from a bash script. I have tried the following command to configure it: echo "y $(databricks url) $(token) $(cluster_id) $(org_id) $(port)" | databricks-connect…
0 votes, 0 answers

Spark JDBC Oracle stuck in one task

I'm running the query below in Oracle: df = spark.read \ .format("jdbc") \ .option("url", "{}".format(db_url)) \ .option("dbtable","({})".format(query)) \ .option("user","{}".format(db_username)) \ .option("numPartitions", 1000)…
Alan Miranda • 125 • 1 • 8
0 votes, 1 answer

What's the correct way to save a dataframe into a Databricks table?

I'm trying to save a big dataframe into a Databricks table to make the data persistent and available to other notebooks without having to query the data sources again: df.write.saveAsTable("cl_data") and using the overwrite method too into the…
L30h • 3 • 2
0 votes, 0 answers

Is it possible to mount EFS on Databricks (AWS)?

I have run the following code from a Databricks notebook: dbutils.fs.mount('', '/mnt/efs') And I get the following error: IllegalArgumentException: Unsupported scheme: null. Allowed schemes are: gs,s3a,s3n,wasbs,adl,abfss. I have…
Kaharon • 365 • 4 • 16
0 votes, 0 answers

Django I/O web app to trigger a Databricks notebook, run the processing, and store the results on S3

I have a fully functional Django web app running on a local Windows machine. However, I now need to deploy it on an AWS EC2 Windows server. This is an "upload - process - download" type of application. Since the processing is quite heavy, I want to shift…
0 votes, 1 answer

Results in Databricks on AWS are not displayed when run as a job

Instead of the expected output from a display(my_dataframe), I get "Failed to fetch the result. Retry" when looking at the completed run (also marked as success). The notebook runs fine, including the expected outputs, when run as an on-demand…
0 votes, 1 answer

Loading a tab-delimited text file as a Hive table/dataframe in Databricks

I am trying to upload a tab-delimited text file in a Databricks notebook, but all the column values are getting pushed into a single column. Here is the SQL code I am using: Create table if not exists database.table using text options (path…
kayd • 1
0 votes, 1 answer

Databricks Error when Loading data to Synapse: java.lang.IllegalArgumentException: Column number mismatch

Exception encountered in Azure Synapse Analytics connector code. Databricks to Synapse data load error: Caused by: java.lang.IllegalArgumentException: Column number mismatch
0 votes, 1 answer

Cluster Automatically Removed from Databricks after lack of usage

Does anyone know how to retrieve a cluster that was automatically removed from Databricks after not being used for some time? I added a bunch of libraries and global init scripts to it, and it was automatically deleted after a month without use. I want to…
nak5120 • 4,089 • 4 • 35 • 94
0 votes, 1 answer

Databricks Instance Profile Creation Failure - "AWS error: You are not authorized to perform this operation"

I'm trying to create a databricks instance profile for use with a previously provisioned workspace and getting the following error when running terraform apply: 2022-01-25T09:32:31.063-0800 [DEBUG] provider.terraform-provider-databricks_v0.4.4: 400…
0 votes, 1 answer

Timestamp conversion in Databricks using date_format

I would like to convert the timestamp below in Databricks. Please help me get the desired format: select date_format(from_utc_timestamp(current_timestamp,'America/Los_Angeles'), 'MM/DD/YY HH24:MI') AS START_TIME Error: IllegalArgumentException: All…
Ashu • 193 • 1 • 16