Questions tagged [databricks-connect]
172 questions
0
votes
1 answer
Run a Spark Scala script on a remote cluster set up on Azure Databricks
I have written a Spark Scala (sbt) application in IntelliJ that I want to run on a remote cluster hosted on Azure Databricks. What steps should I follow to avoid manually uploading jars to DBFS every time I want to test the code?

Sharad R. Telkar
- 45
- 9
0
votes
1 answer
StackOverflowError while calling collectToPython when running Databricks Connect
I am running a PySpark application on a remote cluster with Databricks Connect. I'm facing a problem when trying to retrieve the minimum value of a column when another column has a certain value. When running the following line:
feat_min =…

Mircea Stoica
- 11
- 1
0
votes
0 answers
Connect to Databricks through ODBC without downloading driver
I need to connect to Databricks in order to run queries from my .NET app. I'd like to avoid the Rest API approach and use ODBC but I saw that, in order for the ODBC approach to work, I'd need to download an ODBC driver (Simba Spark). So, can I…

anthino12
- 770
- 1
- 6
- 29
0
votes
0 answers
Databricks Connect does not work from IntelliJ?
I am trying to use Databricks Connect to run a Spark job on a Databricks cluster from IntelliJ. I followed the documentation linked below.
https://docs.databricks.com/dev-tools/databricks-connect.html
However, I could not make it work with IntelliJ and…

Rajesh Kumar Dash
- 2,203
- 6
- 28
- 57
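For reference, Databricks Connect reads its connection settings from a `~/.databricks-connect` JSON file (or matching environment variables); if IntelliJ launches the JVM without access to these, the client cannot reach the cluster even though `databricks-connect test` passes in a terminal. A sketch of the file with placeholder values:

```json
{
  "host": "https://adb-1234567890123456.7.azuredatabricks.net",
  "token": "dapi...",
  "cluster_id": "0123-456789-abcdef",
  "org_id": "1234567890123456",
  "port": 15001
}
```

All values above are placeholders; `databricks-connect configure` writes this file interactively.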
0
votes
1 answer
Create a Spark DataFrame in an IDE (using databricks-connect)
I'm attempting to run some code from my Databricks notebook in an IDE using Databricks Connect. I can't seem to figure out how to create a simple DataFrame.
Using:
import spark.implicits._
var Table_Count =…

steven hurwitt
- 183
- 2
- 15
0
votes
1 answer
How to store a Databricks token created from the CLI in a YAML bash step
I have the following YAML script. I am looking for a way to grab the created token and store it in a variable:
- bash: |
echo {} > ~/.databricks-connect#
source py37-venv/bin/activate
pip3 install wheel
pip3 install…

ibexy
- 609
- 3
- 16
- 34
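Assuming this is an Azure DevOps pipeline (a hedged guess from the bash step and venv layout), one way to capture the CLI-created token is the `##vso[task.setvariable]` logging command; the comment and lifetime below are illustrative:

```yaml
- bash: |
    # Create a token via the Databricks CLI and parse it from the JSON response.
    token=$(databricks tokens create --comment "ci" --lifetime-seconds 3600 \
            | python -c "import sys, json; print(json.load(sys.stdin)['token_value'])")
    # Expose it to later pipeline steps as a secret variable.
    echo "##vso[task.setvariable variable=DATABRICKS_TOKEN;issecret=true]$token"
  displayName: Create and store Databricks token
```

Later steps can then read it as `$(DATABRICKS_TOKEN)`.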
0
votes
1 answer
Databricks connection to ADLS Gen2 text files
I am using Databricks to access my ADLS Gen2 container.
dbutils.fs.mount(
source = "wasbs://@.blob.core.windows.net",
mount_point = "/mnt/",
extra_configs =…

Andy
- 11
- 3
0
votes
0 answers
Is it possible to use Databricks-Connect along with Github to make changes to my Azure Databricks notebooks from an IDE?
My aim is to make changes to my Azure Databricks notebooks from an IDE rather than in Databricks, while at the same time implementing some sort of version control.
Reading the Databricks Connect documentation, it doesn't look like this is supported…

Iqram Choudhury
- 29
- 4
0
votes
1 answer
Add column to existing DataFrame from widget values using PySpark
I have a DataFrame where I need to add a column from the widget value that is being passed. I am trying the code below, but it isn't helping in any way. When we display(pdf), we should also see that the ID column has been added.
…

batman_special
- 115
- 1
- 2
- 10
0
votes
1 answer
Error debugging PySpark after upgrading cluster's Databricks Runtime
I have updated an Azure Databricks cluster from runtime 5.5 LTS to 7.3 LTS. Now I'm getting an error when I debug in VSCode. I have updated my Anaconda environment like this:
> conda create --name dbconnect python=3.7
> conda activate dbconnect
> pip…

Connell.O'Donnell
- 3,603
- 11
- 27
- 61
0
votes
0 answers
Databricks-connect Java connects to local instead of remote
I have a Java application that connects to an Apache Spark cluster and performs some operations. I'm trying to connect to a Databricks cluster on Azure, using databricks-connect 7.3. If I run from the terminal databricks-connect test, everything…

phcaze
- 1,707
- 5
- 27
- 58
0
votes
2 answers
Using Databricks API 2.0 with tokens
I'm trying to hit the Databricks API 2.0 using bearer tokens, and I'm getting a 200 response but no results are showing.
I'm running this command,
curl -H @{'Authorization' = 'Bearer '} https://DataBricks Instance Here/api/2.0/clusters/list

Abdul Haseeb
- 442
- 4
- 22
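The `@{...}` hashtable in that command is PowerShell syntax that curl does not understand, so the Authorization header is likely never sent as intended. A stdlib-only Python sketch of the same call; the host and token are placeholders:

```python
import json
import urllib.request

def auth_header(token: str) -> dict:
    # The header must be a single "Authorization: Bearer <token>" string,
    # which is what the PowerShell hashtable syntax fails to produce.
    return {"Authorization": f"Bearer {token}"}

def list_clusters(host: str, token: str) -> dict:
    # host is the workspace hostname, e.g. adb-123.4.azuredatabricks.net
    # (placeholder); returns the parsed JSON body of clusters/list.
    req = urllib.request.Request(
        f"https://{host}/api/2.0/clusters/list", headers=auth_header(token)
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In curl itself the equivalent would be `-H "Authorization: Bearer <token>"` as one quoted string.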
0
votes
1 answer
How to get the run IDs for a job ID using the Databricks CLI
I tried to get the run IDs using `databricks runs list` on the CLI, but I didn't get the run IDs of all the jobs that run every day; I only got the top 20. I then got the job IDs of all jobs using `databricks jobs list --output json`, and now I want to get…

ishwar
- 298
- 5
- 16
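The runs listing is paginated, which is why only the top 20 runs come back by default. A sketch of paging through the Jobs API `runs/list` endpoint for one job; `fetch_page` is a hypothetical callable standing in for the HTTP request:

```python
def collect_run_ids(fetch_page, job_id, limit=25):
    """Page through Jobs API runs/list results for a single job.

    fetch_page(job_id, offset, limit) stands in for a call to
    GET /api/2.0/jobs/runs/list?job_id=...&offset=...&limit=...;
    that endpoint returns {"runs": [...], "has_more": bool}."""
    run_ids, offset = [], 0
    while True:
        page = fetch_page(job_id, offset, limit)
        runs = page.get("runs", [])
        run_ids += [r["run_id"] for r in runs]
        if not page.get("has_more") or not runs:
            break
        offset += len(runs)
    return run_ids
```

The same pagination is available on the CLI via `databricks runs list --job-id <id> --offset <n> --limit <n>`.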
0
votes
0 answers
Is there any way to unnest BigQuery columns in Databricks in a single PySpark script
I am trying to connect to BigQuery using the latest Databricks version (7.1+, Spark 3.0) with PySpark as the script editor/base language.
We ran the PySpark script below to fetch data from a BigQuery table into Databricks:
from pyspark.sql import SparkSession
spark =…

Harini
- 21
- 4
0
votes
1 answer
How to monitor Databricks jobs using the CLI or Databricks API to get information about all jobs
I want to monitor the status of the jobs to see whether they are running overtime or have failed. If you have a script or any reference, please help me with this. Thanks.

ishwar
- 298
- 5
- 16
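One way to sketch such a monitor over the Jobs API `runs/list` response: flag failed runs and long-running ones. `flag_runs` and the threshold are hypothetical; each run dict is assumed to carry `state` and a `start_time` in epoch milliseconds, as the API documents:

```python
from datetime import datetime, timezone

def flag_runs(runs, max_minutes=60):
    """Split runs/list output into failed run IDs and overtime run IDs.

    A run counts as overtime when it is still RUNNING and started more
    than max_minutes ago; FAILED is read from state.result_state."""
    failed, overtime = [], []
    now_ms = datetime.now(timezone.utc).timestamp() * 1000
    for r in runs:
        state = r.get("state", {})
        if state.get("result_state") == "FAILED":
            failed.append(r["run_id"])
        elif state.get("life_cycle_state") == "RUNNING":
            if (now_ms - r["start_time"]) / 60000 > max_minutes:
                overtime.append(r["run_id"])
    return failed, overtime
```

Fed from `databricks runs list --output json` (or the REST endpoint) on a schedule, this gives a minimal overtime/failure alerting loop.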