
I have a Java application that connects to an Apache Spark cluster and performs some operations. I'm trying to connect to a Databricks cluster on Azure using databricks-connect 7.3. If I run `databricks-connect test` from the terminal, everything works perfectly. I'm following their documentation: I included the jars in IntelliJ, added `spark.databricks.service.server.enabled true` to the cluster configuration in Databricks, and used the following to create the SparkSession:

SparkSession spark = SparkSession
                .builder()
                .master("local")
                .getOrCreate();

The problem is that this command connects to a local cluster that is instantiated at runtime, and does not connect to the remote Databricks cluster. Am I missing something?

  • I found the problem: the previous Maven dependencies on spark-core and spark-sql were still on the classpath. Removing them and exclusively using the databricks-connect jars solved the issue. – phcaze Feb 04 '21 at 16:17
  • also remove `.master("local")` - in this case your code will be portable, because master would be setup from the outside – Alex Ott Feb 04 '21 at 16:21
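Putting the two comment fixes together, a minimal sketch of the corrected setup (the class name and the `range(...).show()` call are illustrative; it assumes the jars reported by `databricks-connect get-jar-dir` are the only Spark jars on the classpath, with the Maven spark-core/spark-sql dependencies removed):

```java
import org.apache.spark.sql.SparkSession;

public class DatabricksConnectExample {
    public static void main(String[] args) {
        // No .master("local") here: with only the databricks-connect jars
        // on the classpath, the master is resolved from the
        // databricks-connect configuration, so the same code can run
        // against the remote cluster or inside a Databricks job.
        SparkSession spark = SparkSession
                .builder()
                .getOrCreate();

        // A trivial action to confirm the session reaches the cluster.
        spark.range(10).show();
    }
}
```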
