Questions tagged [databricks-community-edition]
85 questions
0 votes, 0 answers
Azure Databricks community edition COPY INTO command error
I have been getting the following error upon running the COPY INTO command in Databricks Community Edition:
Error in SQL statement: UnsupportedOperationException: com.databricks.backend.daemon.data.client.DBFSV1.createAtomicIfAbsent(path:…

SAR182
0 votes, 1 answer
Why do I have a label problem when using CrossValidator
I'm new to Spark :) I am trying to use CrossValidator. My model is as follows:
training
# training data - several split ratios have been tested; 50/50 seems the best
(trainData, testData) = modelData.randomSplit([0.5, 0.5])
#counting data…

CenturyGhost
0 votes, 1 answer
How to calculate the sum and average of multiple elements with an RDD
I am a complete newbie in PySpark. I have organized an RDD with the following code:
rdd1 = labRDD.map(lambda kv: (kv[0].split("/")[-1].split('.')[0], kv[1]))
rdd2 = rdd1.flatMapValues(lambda v: v.split('\r\n'))
rdd3 = rdd2.map(lambda kv: (kv[0],…

CenturyGhost
0 votes, 1 answer
How to read a CSV file from an FTP server using PySpark in Databricks Community Edition
I am trying to fetch a file (hosted on Hostinger) over FTP using PySpark in Databricks Community Edition.
Everything works fine until I try to read that file using spark.read.csv('MyFile.csv').
The code and the error are as follows:
PySpark…
0 votes, 1 answer
Converting SQL Query to Databricks SQL
I have a query that I need to convert to Databricks SQL, or run against a table in a Databricks environment, but it fails even though it works well against tables in SQL Server. The tables and query can be found here
The query to convert or run in…

UpwardD
0 votes, 1 answer
Could anyone help me with these two SQL queries on my Items table?
I made a SQL project for myself to learn.
I am trying to answer these two questions:
How many total items were sold between 1970 and 2000?
What state in the US had the most items sold overall?
I am using Databricks Community Edition. I am…

Cyber Chick
0 votes, 1 answer
Databricks notebook (Python): FileNotFoundError
I run the following code in a Databricks notebook and get a FileNotFoundError:
import pandas as pd
df = pd.read_csv ('E:\Myfolder1\Myfolder2\Myfolder3\myfile.csv')
print(df)
FileNotFoundError: [Errno 2] No such file or directory:…

Anson
0 votes, 2 answers
Why does one table give null values in a self join?
I'm using Databricks Community Edition. I created a temporary view.
%python
df.createOrReplaceTempView("athlete_events_csv")
The query I'm writing:
with medal_count_by_country as
(SELECT NOC, Year, count(*) as medal_count, row_number() over(…

s c
0 votes, 2 answers
Append a dynamic multi-line header to a .txt file that has data from Databricks
I am trying to load data into an abc.txt file from a .csv file which is stored in Delta Lake.
Example: data loaded with | separators in the abc.txt file
id|name|address|contact_no
1|abc|xyz1|123
2|efg|xyz2|456
3|hij|xyz3|789
4|klmn|xyz4|91011
Header…

Sai Kiran
0 votes, 0 answers
I cannot connect from my cloud Kafka to Databricks Community Edition's Spark cluster
1- I have a Spark cluster on Databricks Community Edition and a Kafka instance on GCP.
2- I just want to ingest streaming data from Kafka into Databricks Community Edition and analyze it with Spark.
3-
This is my connection…

Tugrul Gokce
0 votes, 1 answer
Cannot apply count() or collect() on RDD from textfile (Spark)
I am new to Spark and have a Databricks Community Edition account. Right now I'm doing a lab and encountered the following error:
!rm README.md* -f
!wget https://raw.githubusercontent.com/carloapp2/SparkPOT/master/README.md
textfile_rdd =…

Renat Abdrakhmanov
0 votes, 1 answer
Databricks, Spark UI, SQL logs: retrieve with REST API
Is it possible to retrieve Databricks/Spark UI/SQL logs using the REST API? Is there any retention limit? I can't see any related API in the Azure Databricks REST API.
Note: the cluster's Advanced options / Logging setting has not been set.

binar
0 votes, 0 answers
How to find and replace a string with a nested array in CSV using Scala
I have a requirement to load a CSV and find and replace a string with a nested array using Databricks with Scala.
Can you please help me with this?
Regards,
Ram

Ram
0 votes, 0 answers
Runtime duration timeout - Databricks
Databricks environment -
I'm trying to add a table (CSV file) in my notebook, which is connected successfully to a cluster.
But midway through uploading the data, an error message appears that says 'couldn't upload, runtime duration timeout'
How do I fix…

silbia
0 votes, 1 answer
How to connect to MongoDB Atlas from a Databricks cluster using PySpark
How to connect to MongoDB Atlas from a Databricks cluster using PySpark
This is my simple code in the notebook:
from pyspark.sql import SparkSession
spark = SparkSession \
.builder \
.appName("myApp") \
.config("spark.mongodb.input.uri",…

Arnab Mandal