I am able to pull data via databricks-connect and run Spark jobs perfectly. My question is how to run non-Spark, native Python code on the remote cluster. I'm not sharing the code due to confidentiality.

Nagarjuna Kanneganti
https://docs.databricks.com/dev-tools/databricks-connect.html#limitations – Chris Oct 11 '21 at 14:46
1 Answer
When you're using databricks-connect, your local machine acts as the driver of your Spark job, so non-Spark code will always be executed on your local machine. If you want to execute it remotely, you need to package it as a wheel/egg, or upload the Python files to DBFS (for example, via databricks-cli), and then execute your code as a Databricks job (for example, using the Run Submit command of the Jobs REST API, or by creating a job with databricks-cli and using `databricks jobs run-now` to execute it).
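As a rough sketch of the Run Submit route: the payload below describes a one-time run of a Python file already uploaded to DBFS. The file path, cluster spec, and run name are hypothetical placeholders, not values from the question; adjust them for your workspace.

```python
import json

# Hypothetical values -- replace with your own script path and cluster spec.
payload = {
    "run_name": "native-python-job",
    "new_cluster": {
        "spark_version": "9.1.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 1,
    },
    # spark_python_task runs a plain Python file on the cluster's driver,
    # so non-Spark code in it executes remotely rather than on your laptop.
    "spark_python_task": {
        "python_file": "dbfs:/scripts/my_script.py",
    },
}

# POST this JSON to <workspace-url>/api/2.1/jobs/runs/submit with a
# personal access token in the Authorization header.
print(json.dumps(payload, indent=2))
```

Because the script runs inside the cluster, anything it does (Spark or not) happens remotely, which is the behavior databricks-connect alone cannot give you.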

Alex Ott