Questions tagged [databricks-connect]

172 questions
0
votes
1 answer

ModuleNotFoundError: No module named 'databricks' in virtual environment with databricks-connect installed

I am trying to use Databricks Connect. I have installed databricks-connect version 9.1.39 in a virtual environment within my Python project, and I have selected the python3.8 executable in that virtual environment as the interpreter for the VS Code project.…
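One way to narrow down an error like this is to confirm which interpreter is actually running and whether it can see the package. The following is a minimal diagnostic sketch using only the standard library. It assumes the top-level module name is databricks, as in the error message, and the helper name describe_environment is hypothetical.

```python
import importlib.util
import sys


def describe_environment(module_name: str = "databricks") -> dict:
    """Report the running interpreter and whether module_name is visible to it."""
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the interpreter running this code
        "prefix": sys.prefix,  # root of the active environment
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": spec is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported executable is not the one inside the virtual environment, the editor is probably running a different interpreter than the one where the package was installed.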
0
votes
1 answer

How to group by 30-minute intervals in Databricks SQL

This is the expression I was using to group by 30-minute intervals in SQL: convert(time(0),dateadd(minute,(datediff(minute,0,a.Datetime)/30)*30,0)) where, for example, a Datetime of 2023-03-09 00:26:01.6830000 is grouped as 00:00:00. First column values are…
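The bucketing arithmetic in the excerpt above (integer-divide the minute count by 30, then multiply back by 30) can be sketched in Python as follows. This only illustrates the arithmetic and is not Databricks SQL. The function name floor_to_half_hour is hypothetical.

```python
from datetime import datetime, time


def floor_to_half_hour(ts: datetime) -> time:
    """Return the start of the 30-minute bucket containing ts, as a time of day."""
    # Integer division drops the remainder: minutes 0-29 map to 0, 30-59 map to 30.
    bucket_minute = (ts.minute // 30) * 30
    return time(hour=ts.hour, minute=bucket_minute)


# Example timestamp from the question (fractional seconds kept to microseconds).
sample = datetime(2023, 3, 9, 0, 26, 1, 683000)
print(floor_to_half_hour(sample))  # 00:00:00
```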
0
votes
1 answer

Can't repartition rdd when connecting with databricks-connect

When connecting to a Databricks cluster with databricks-connect, I get a Py4JJavaError exception when I repartition a simple RDD: from pyspark.sql import SparkSession spark = SparkSession.builder.getOrCreate() rdd =…
fskj
0
votes
0 answers

Uninstall Databricks Connect in CI/CD Pipeline

I have a CI/CD pipeline (with a self-hosted Agent), where I first install databricks-connect and then set the configuration. - script: | pip install --upgrade --force-reinstall databricks-cli pip install --upgrade --force-reinstall…
user3579222
0
votes
0 answers

What Azure Databricks cluster policy should be used to allow PySpark, R, Scala, SQL and also enable AD passthrough authentication?

I was working with Azure Databricks clusters recently and found that I needed both: (1) AD passthrough authentication to read data from ADLS using PySpark, and (2) Scala on the same cluster to perform different tasks. What cluster access mode should be…
0
votes
1 answer

How can I connect a .NET application to read a stream from a Delta table in Azure Databricks?

How can I write, e.g., a console application in .NET that reads a Delta table or opens a stream to a Delta table in Azure Databricks? I've tried this code: var spark = SparkSession .Builder() .AppName("Streaming example with a UDF") …
Mathias Rönnlund
0
votes
0 answers

DBConnect - Table or view not found

In my development environment, I have implemented a Scala project that connects to the Databricks cluster via dbconnect. Everything is configured correctly, and it connects to the cluster and fetches the data. However, it is observed that…
0
votes
1 answer

ENC_KEY_LEN error while running queries on Azure Databricks

I am facing the issue below while running queries on Azure Databricks using Oracle Data Integrator. Please help me resolve it. Caused By: java.sql.SQLException: [Databricks]DatabricksJDBCDriver ERROR processing query/statement.…
0
votes
1 answer

Read Remote S3 File Using Databricks Connect

I am trying to read a file in an S3 bucket using Spark through Databricks Connect. This is the code that I am using: from pyspark import SparkConf from pyspark.sql import SparkSession conf = SparkConf() conf.set('spark.jars.packages',…
Minura Punchihewa
0
votes
1 answer

Unable to connect to Databricks cluster from Windows using databricks-connect

I am trying to set up databricks-connect on my Windows machine. While running databricks-connect test, I get the error below, complaining that a Java certificate was not found. Caused by: sun.security.validator.ValidatorException: PKIX path building…
0
votes
1 answer

Looping Through Data Frames with Dynamic withColumn Injection

I'm looking to create a dynamic .withColumn, with the column "rules" being replaced by a list depending on the file being processed. For example: File A has a column called "Validated" that is based on a different condition to File B but has the…
0
votes
0 answers

Running Python scripts on Databricks cluster

Is it possible to run an arbitrary Python script written in PyCharm on my Azure Databricks cluster? Databricks suggested using databricks-connect, but it turned out to be useful only for Spark jobs. More specifically, I'd like to use networkx to…
0
votes
1 answer

Databricks-connect using the wrong Java version

I set up and configured databricks-connect in a conda env on Windows 10. One of the prerequisites is having Java < 8 for it to work. I tried to install Java 8 and even Java 7 from here:…
the phoenix
0
votes
1 answer

PySpark not working after installing databricks-connect

I installed databricks-connect in a conda environment without having pyspark installed (I read that having pyspark installed would conflict with the installation of databricks-connect). After finishing the configuration of databricks-connect with the…
0
votes
0 answers

Can Databricks Connect be used when running tests via Maven?

We have a Maven test framework project, written in ScalaTest, in IntelliJ. A test case makes use of Databricks Connect to read and write to DBFS. If we right-click and run the test case, everything succeeds. However, if we run the test case via…
user13800089