
I successfully installed databricks-connect and it works fine for me, but for some reasons I want to use a local SparkSession. The problem is that just running SparkSession.builder.getOrCreate() uses the Databricks cluster, not a local session on my laptop. Any ideas, please, without creating a new virtualenv?

Thanks

Majdi

1 Answer


That's the default behavior of Databricks Connect: it always uses remote executors, even if the master is set to local. But do you really need it for your development? A standard approach is to use pyspark locally and run tests on Databricks from the CI/CD pipeline. This approach is demonstrated by the cicd-templates project, where unit tests are executed locally with pyspark, and integration tests run on Databricks clusters, triggered from the CI/CD pipeline.

Alex Ott
  • Exactly, that is what I want to do, but since databricks-connect is configured in my venv, whenever I create a SparkSession it always uses remote executors. The only solution I have found is to create another venv without databricks-connect and switch to it when I want a local pyspark session – Majdi Mar 03 '21 at 08:34
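The venv-switch workaround described in the comment can be sketched with only the Python standard library. This is an illustrative sketch, not part of the original answer: the env name pyspark-local is an assumption, and the pip install of pyspark is shown only as a comment.

```python
import os
import subprocess
import tempfile
import venv

# Create a fresh virtualenv that does NOT inherit the
# databricks-connect install from the current environment.
env_dir = os.path.join(tempfile.mkdtemp(), "pyspark-local")
venv.EnvBuilder(with_pip=True).create(env_dir)

# Path to the new interpreter (Scripts\ on Windows, bin/ elsewhere).
bindir = "Scripts" if os.name == "nt" else "bin"
python = os.path.join(env_dir, bindir, "python")
print(os.path.exists(python))

# In that env, installing plain pyspark (not databricks-connect) gives a
# genuinely local session, e.g. (not executed here):
#   subprocess.run([python, "-m", "pip", "install", "pyspark"], check=True)
# and then SparkSession.builder.master("local[*]").getOrCreate()
# runs entirely on the laptop.
```

Switching between the two environments (activate/deactivate, or pointing your IDE at the other interpreter) is then what selects local versus remote execution.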