Questions tagged [databricks-community-edition]
85 questions
1
vote
1 answer
SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: dbc
I am using Databricks Community Edition.
Here is the code:
code
It seems that Spark cannot read or process the .dbc file format. I get this error:
org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find data source: dbc.…

MBC
- 15
- 2
1
vote
0 answers
How to prevent pyspark from reading a parquet file's header record as just another row instead of as the header?
I have a parquet file with 11 columns. I tried the ways below in pyspark to read the file. It still assigns header names like Prop_0, Prop_1, Prop_2 instead of reading the starting header as the header…

moonchild
- 11
- 1
1
vote
1 answer
Set Workflow Job Concurrency Limit in Databricks
I need a job to be triggered every 5 minutes. However, if that job is already running, it must not be triggered again until that run is finished. Hence, I need to set the maximum run concurrency for that job to only one instance at a time.
What…

bda
- 372
- 1
- 7
- 22
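For reference, the Databricks Jobs API exposes a max_concurrent_runs field for exactly this requirement; with it set to 1, a scheduled trigger is skipped while a previous run is still active. A minimal sketch of a job-settings payload (the job name and cron expression are illustrative):

```python
import json

# Sketch of Jobs API job settings: max_concurrent_runs=1 makes Databricks skip
# a new scheduled trigger while a previous run of the same job is still active.
job_settings = {
    "name": "every-5-min-job",  # illustrative name
    "schedule": {
        "quartz_cron_expression": "0 0/5 * * * ?",  # every 5 minutes
        "timezone_id": "UTC",
    },
    "max_concurrent_runs": 1,  # at most one active run at a time
}
payload = json.dumps(job_settings)
```

The same field is editable in the job's UI settings; the JSON form above is what a REST create/update call would carry.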
1
vote
1 answer
Difference between the Spark cluster provided in Databricks Community Edition and master = local[8] mentioned in Spark?
I am using Databricks Community Edition, and the cluster on which my notebook is running shows:
that it has a driver with 15 GB memory and 2 cores.
Whereas when I get the Spark config in my notebook, it shows:
Why is it still showing…

Karan Dhar
- 47
- 4
1
vote
1 answer
Issue: "multi-column In predicates are not supported in the DELETE condition"
I am using Spark 2.4.5 with Java 8 in my Spark job, which writes data into an S3 path.
Due to multiple accidental triggers of the job, it created duplicate records.
I am trying to remove the duplicates from the S3 path using Databricks.
While I am trying to…

Shasu
- 458
- 5
- 22
1
vote
1 answer
Generated/Default value in Delta table
I'm trying to set a default value for a column in a Delta Lake table, for example:
CREATE TABLE delta.dummy_7 (id INT, yes BOOLEAN, name STRING, sys_date DATE GENERATED ALWAYS AS CAST('2022-01-01' AS DATE), sys_time TIMESTAMP) USING DELTA;
Error in…

Luis Estrada
- 371
- 7
- 20
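One plausible reading of the error (an assumption, since the message is truncated) is that Delta Lake's generated-column syntax requires parentheses around the expression. A sketch of the corrected DDL, which would be submitted with spark.sql on a Delta-enabled cluster:

```python
# Plausible fix (assumption: the failure is the missing parentheses that
# Delta Lake's GENERATED ALWAYS AS clause requires around its expression).
ddl = """
CREATE TABLE delta.dummy_7 (
  id INT,
  yes BOOLEAN,
  name STRING,
  sys_date DATE GENERATED ALWAYS AS (CAST('2022-01-01' AS DATE)),
  sys_time TIMESTAMP
) USING DELTA
"""
# On a Delta-enabled cluster this would be executed with spark.sql(ddl).
```

Note that a generated column is recomputed from its expression, which is not quite the same thing as a default value that a writer can override.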
1
vote
1 answer
Get cluster metric (Ganglia charts) of all clusters via REST API in Databricks
The question is specific to Databricks. Is there any API to get the Ganglia chart showing cluster usage? I need to get all the Ganglia charts available in the Databricks cluster metrics section, for all clusters, via REST API calls. We are…

Scorpio
- 511
- 4
- 14
1
vote
2 answers
Passing a DataFrame from one notebook to another with pyspark
I'm trying to use a DataFrame that I created in notebook1 in my notebook2, in Databricks Community Edition with pyspark. I tried this code: dbutils.notebook.run("notebook1", 60, {"dfnumber2"})
but it shows this…

BENOTH7
- 27
- 5
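Part of the problem may be the argument literal itself: {"dfnumber2"} is a Python set, while dbutils.notebook.run expects a string-to-string map, and it cannot carry a DataFrame in any case. A small sketch of the distinction, with the usual temp-view workaround in comments (the view and key names are illustrative):

```python
# {"dfnumber2"} is a *set* literal, not a dict. dbutils.notebook.run only
# accepts a map of string arguments, and a DataFrame can't be passed directly.
broken_args = {"dfnumber2"}              # a set – rejected by the API
fixed_args = {"view_name": "dfnumber2"}  # a dict of strings – accepted

# Usual workaround on Databricks (runs there, not locally):
#   notebook1:  df.createOrReplaceGlobalTempView("dfnumber2")
#   notebook2:  df = spark.table("global_temp.dfnumber2")
```

The global temp view lives for the duration of the cluster, so both notebooks must be attached to the same cluster.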
1
vote
0 answers
How do I import a local >2GB JSON file into Databricks Community Edition?
When I try to do it through their UI, I receive an error saying that the file size is too large. Are there any other ways to do this besides the Databricks UI?

James Manson
- 11
- 1
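If the file happens to be JSON Lines (one record per line; an assumption, since the question doesn't say), one workaround is to split it locally into parts under the upload limit and import each part. A stdlib sketch, approximating bytes by character count for simplicity:

```python
def split_jsonl(path, max_bytes, out_prefix):
    """Split a JSON Lines file into parts of roughly max_bytes each
    (assumes no single line exceeds max_bytes; counts characters, not
    raw bytes, which is exact only for ASCII content)."""
    part, size = 0, 0
    out = open(f"{out_prefix}.part0", "w")
    with open(path) as src:
        for line in src:
            # rotate to a new part before this line would push us over the cap
            if size + len(line) > max_bytes and size > 0:
                out.close()
                part += 1
                size = 0
                out = open(f"{out_prefix}.part{part}", "w")
            out.write(line)
            size += len(line)
    out.close()
    return part + 1  # number of parts written
```

Each part is then small enough for the UI upload, and spark.read.json can read all parts at once from a directory.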
1
vote
0 answers
with open to read JSON not working on Databricks
I was creating a function to write to MongoDB Atlas, and I could not open the JSON file from dbfs/FileStore. I did research on this, but it seems to be a Community Edition issue, and none of the examples I found worked. I was wondering if…

reksapj
- 101
- 7
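For context: on Community Edition the /dbfs FUSE mount that plain open() would need is not available, so the usual workaround is to copy the file to the driver's local disk with dbutils.fs.cp first. A sketch; the helper name is ours, and the dbutils calls are shown as comments because dbutils exists only on Databricks:

```python
def dbfs_to_local(dbfs_path, local_dir="/tmp"):
    """Map a dbfs:/ URI to a driver-local path to copy the file to.
    (Helper name and layout are illustrative.)"""
    name = dbfs_path.rstrip("/").rsplit("/", 1)[-1]
    return f"{local_dir}/{name}"

# On Databricks (dbutils exists only there):
#   local = dbfs_to_local("dbfs:/FileStore/config.json")
#   dbutils.fs.cp("dbfs:/FileStore/config.json", "file:" + local)
#   with open(local) as f:
#       data = json.load(f)
```

After the copy, any plain-Python file API (open, json.load, pymongo helpers) works on the local path.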
1
vote
1 answer
Preprocessing large data in Databricks Community Edition
I have a 16 GB dataset and want to use it in Databricks. However, the DBFS limit in Community Edition is 10 GB.
Could you please help me preprocess the data so that I can move it from the driver to DBFS?

Shihab Masri
- 21
- 2
1
vote
1 answer
Unable to access files uploaded to DBFS on Databricks Community Edition Runtime 9.1; tried the dbutils.fs.cp workaround, which also didn't work
I'm a beginner to Spark and just picked up the highly recommended 'Spark: The Definitive Guide' textbook. I was running the code examples and came across the first one that needed me to upload the flight-data CSV files provided with the book. I've…

LearneR
- 2,351
- 3
- 26
- 50
1
vote
1 answer
Can You Persist a Model in Databricks Community Edition?
Is there a way to persist a Python machine learning model when using the free Databricks community edition?
It looks like the DBFS is not available. This means that I can't use tools like joblib to save the model in the file system.
MLflow is not…

Trey
- 201
- 3
- 14
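One avenue even without DBFS: the driver's local filesystem (e.g. /tmp) is writable for the lifetime of the cluster, so a model can be serialized there. A sketch using stdlib pickle as a stand-in for joblib, with a placeholder dict in place of a fitted model:

```python
import pickle

# Placeholder standing in for a fitted sklearn model; a real estimator
# pickles the same way (joblib.dump/load would also work on this path).
model = {"coef": [0.4, 1.7], "intercept": -0.2}

# /tmp is on the driver's local disk, which is writable even when DBFS isn't.
with open("/tmp/model.pkl", "wb") as f:
    pickle.dump(model, f)

with open("/tmp/model.pkl", "rb") as f:
    restored = pickle.load(f)
```

The caveat is durability: this survives only as long as the cluster does, so for anything permanent the serialized bytes still need to leave the driver (download, external object storage, etc.).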
1
vote
0 answers
Building an API around Databricks Notebook
I'm very new to the Databricks community platform. I have recently developed an ML model using Databricks and would like to productionize it behind a Swagger API. I have tried it in bits and pieces but can't figure it out at all. Can someone please…

Fahad
- 11
- 2
1
vote
1 answer
Unable to create feature table on Databricks
from pyspark.sql import SparkSession, Row
from datetime import date
spark = SparkSession.builder.getOrCreate()
tempDf = spark.createDataFrame([
    Row(date=date(2022, 1, 22), average=40.12),
    Row(date=date(2022, 1, 23), average=41.32),
…

user22
- 112
- 1
- 9