Questions tagged [databricks-dbx]

28 questions
0
votes
1 answer

Default values in Databricks deployment.yaml file

In our deployment.yaml file we have basically the same instructions for each environment, but there are some settings I might want to set differently per environment, e.g. schedules. Can I e.g. define a default profile, where I would put the steps…
Mathias Rönnlund
  • 4,078
  • 7
  • 43
  • 96
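
For the deployment.yaml defaults question above, a minimal sketch of one common approach, assuming placeholder job names, schedules, and keys: declare the shared settings once under a custom top-level block with a YAML anchor, then merge and override them per environment.

    custom:
      default-schedule: &default-schedule
        quartz_cron_expression: "0 0 6 * * ?"   # placeholder cron
        timezone_id: "UTC"

    environments:
      dev:
        workflows:
          - name: "my-job"
            schedule:
              <<: *default-schedule
      prod:
        workflows:
          - name: "my-job"
            schedule:
              <<: *default-schedule
              quartz_cron_expression: "0 0 3 * * ?"   # per-environment override

Because YAML merge keys let explicit keys win over the merged anchor, only the values that differ need to be repeated per environment.
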
0
votes
2 answers

Databricks DBX pass parameters to notebook job

For a standard deployment.yaml file for dbx on Databricks, as given below: workflows: - name: "your-job-name" job_clusters: - job_cluster_key: "basic-cluster" <<: *basic-static-cluster - job_cluster_key:…
Tarique
  • 463
  • 3
  • 16
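
For the parameter-passing question above, a hedged sketch of the usual pattern (the notebook path, cluster key, and parameter names are placeholders): put base_parameters on the notebook_task in conf/deployment.yml.

    workflows:
      - name: "your-job-name"
        tasks:
          - task_key: "main"
            job_cluster_key: "basic-cluster"        # cluster defined elsewhere in the file
            notebook_task:
              notebook_path: "/Repos/project/my_notebook"   # placeholder path
              base_parameters:
                env: "dev"
                run_date: "2023-01-01"

Inside the notebook, the values then come back via dbutils.widgets.get("env") and dbutils.widgets.get("run_date").
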
0
votes
0 answers

Using databricks dbx jinja template creates duplicate workflows

I am deploying several workflows with a common deployment file and passing variables with the dbx jinja option: dbx deploy --jinja-variables-file=conf/vars.yml If I change anything in the deployment file or any of the variables in vars.yml, this…
Kotka
  • 481
  • 4
  • 10
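
For the Jinja question above, a minimal sketch of how the variables file and template usually fit together, assuming placeholder file names and variables; the deployment itself is driven by dbx deploy --jinja-variables-file=conf/vars.yml against a template named with a .j2 suffix.

    # conf/vars.yml
    env_name: "dev"
    schedule_cron: "0 0 6 * * ?"

    # conf/deployment.yml.j2
    environments:
      default:
        workflows:
          - name: "ingest-{{ env_name }}"
            schedule:
              quartz_cron_expression: "{{ schedule_cron }}"
              timezone_id: "UTC"
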
0
votes
1 answer

ModuleNotFoundError: No module named 'autoreload' on 12.2 LTS

I am using dbx to work on a mixed-mode development loop. This is the link in case you want to read about it. These are the steps: First cell: import autoreload %load_ext autoreload %autoreload 2 Second cell: from pathlib import Path import…
jalazbe
  • 1,801
  • 3
  • 19
  • 40
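
For the autoreload question above, note that autoreload is an IPython extension rather than an importable module, so the import line fails on the cluster; loading it through magics alone is enough:

    # autoreload is an IPython extension, not a Python package,
    # so skip `import autoreload` and load it via magics only
    %load_ext autoreload
    %autoreload 2
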
0
votes
0 answers

ERROR installing dbx - pip install dbx - pipenv error cffi

I am trying to set up a connection between Visual Studio Code and Databricks using pyenv, following these instructions: https://docs.databricks.com/dev-tools/ide-how-to.html When I try to install dbx (pip install dbx) I get an error with the cffi package.…
Nankin
  • 45
  • 7
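
For the cffi install failure above, one common cause is a missing libffi header when pip has to build cffi from source. A hedged workaround; the package manager and package name are assumptions about a Debian/Ubuntu host:

    # install the libffi build dependency, refresh the build tooling, then retry
    sudo apt-get install -y libffi-dev
    python -m pip install --upgrade pip setuptools wheel
    pip install dbx
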
0
votes
1 answer

databricks dbx execute - how to check cluster standard error log

When I use databricks connect I can see the standard error log via my local shell. Now I am using databricks dbx, which only shows the dbx log... Is there a way to check the cluster log (standard error) easily? Standard error/log4j out of the databricks cluster. Update: when…
Benny
  • 9
  • 2
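
For the cluster log question above, one hedged option is to have the cluster deliver its driver stdout, stderr, and log4j output to DBFS via cluster_log_conf in the cluster spec of conf/deployment.yml, then read the files after the run. Node type and destination path are placeholders:

    new_cluster:
      spark_version: "10.4.x-scala2.12"
      node_type_id: "Standard_DS3_v2"              # placeholder node type
      num_workers: 1
      cluster_log_conf:
        dbfs:
          destination: "dbfs:/cluster-logs/my-job"  # driver stdout/stderr and log4j land here
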
0
votes
0 answers

Running python scripts on Databricks cluster

Is it possible to run an arbitrary python script written in PyCharm on my Azure Databricks cluster? Databricks offers databricks-connect, but it turned out to be useful only for spark jobs. More specifically, I'd like to use networkx to…
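
For the question above about running arbitrary Python scripts, a hedged sketch of one route with dbx (names, paths, and cluster settings are placeholders): reference the local script as a spark_python_task and attach networkx as a PyPI library in conf/deployment.yml, then run it with dbx execute or dbx deploy/launch.

    workflows:
      - name: "networkx-script"
        new_cluster:
          spark_version: "10.4.x-scala2.12"
          node_type_id: "Standard_DS3_v2"           # placeholder
          num_workers: 1
        libraries:
          - pypi:
              package: "networkx"
        spark_python_task:
          python_file: "file://scripts/run_graph.py"   # local file uploaded by dbx
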
0
votes
0 answers

How to make job wait for cluster to become available

I have a workflow in Databricks called "score-customer", which I can run with a parameter called "--start_date". I want to make a job for each date this month, so I manually create 30 runs - passing a different date parameter for each run. However,…
Average_guy
  • 509
  • 4
  • 16
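
For the question above about 30 dated runs, a hedged Python sketch (host, token, and job id are placeholders) that triggers the score-customer job through the Jobs 2.1 REST API and waits for each run to reach a terminal state before submitting the next, so the runs queue on the client side instead of competing for the cluster:

    import time
    import requests

    HOST = "https://<workspace>.azuredatabricks.net"   # placeholder workspace URL
    TOKEN = "<personal-access-token>"                  # placeholder token
    JOB_ID = 123                                       # placeholder id of the "score-customer" job
    HEADERS = {"Authorization": f"Bearer {TOKEN}"}

    def run_and_wait(start_date: str) -> None:
        # trigger one run with its date parameter
        run_id = requests.post(
            f"{HOST}/api/2.1/jobs/run-now",
            headers=HEADERS,
            json={"job_id": JOB_ID, "python_params": ["--start_date", start_date]},
        ).json()["run_id"]
        # poll until the run reaches a terminal state
        while True:
            state = requests.get(
                f"{HOST}/api/2.1/jobs/runs/get",
                headers=HEADERS,
                params={"run_id": run_id},
            ).json()["state"]
            if state["life_cycle_state"] in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
                break
            time.sleep(60)

    for day in range(1, 31):
        run_and_wait(f"2023-06-{day:02d}")
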
0
votes
2 answers

Deploy sql workflow with DBX

I am developing deployment via DBX to Azure Databricks. In this regard I need a data job written in SQL to run every day. The job is located in the file data.sql. I know how to do it with a python file. Here I would do the following: build: …
andKaae
  • 173
  • 1
  • 13
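
For the SQL workflow question above, one hedged option is a thin Python wrapper that reads data.sql and executes its statements through Spark, so the existing spark_python_task deployment pattern still applies. The file layout and the simple split on ';' are assumptions:

    # run_sql.py -- hypothetical wrapper around data.sql
    from pathlib import Path
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # naive statement splitting; assumes data.sql has no ';' inside string literals
    sql_text = Path("data.sql").read_text()
    for statement in (s.strip() for s in sql_text.split(";")):
        if statement:
            spark.sql(statement)
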
0
votes
1 answer

Refactoring AzureML pipeline into dbx pipeline with deployment file

My company is in the process of migrating all our pipelines over to Databricks from AzureML, and I have been tasked with refactoring one of our existing pipelines made with azureml-sdk (using functions such as PipelineData, PythonScriptStep etc.),…
Average_guy
  • 509
  • 4
  • 16
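
For the migration question above, a rough sketch of how a sequence of PythonScriptStep-style stages often maps onto a dbx multi-task workflow: each step becomes a task, depends_on replaces the pipeline's implicit ordering, and intermediate data moves through storage paths instead of PipelineData. All names and cluster settings are placeholders:

    environments:
      default:
        workflows:
          - name: "refactored-pipeline"
            job_clusters:
              - job_cluster_key: "shared"
                new_cluster:
                  spark_version: "10.4.x-scala2.12"
                  node_type_id: "Standard_DS3_v2"   # placeholder
                  num_workers: 2
            tasks:
              - task_key: "prepare"
                job_cluster_key: "shared"
                spark_python_task:
                  python_file: "file://steps/prepare.py"
              - task_key: "train"
                depends_on:
                  - task_key: "prepare"
                job_cluster_key: "shared"
                spark_python_task:
                  python_file: "file://steps/train.py"
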
0
votes
1 answer

databricks-dbx HTTPError 403 Client Error

I am running some jobs using: dbx version 0.7.4, pyspark 3.2.2, delta-spark 2.0.0, Python 3.8.1. I am following the guidelines from: https://dbx.readthedocs.io/en/latest/features/assets/?h=dbx+launch+assets I run the following commands: dbx deploy…
jalazbe
  • 1,801
  • 3
  • 19
  • 40
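
For the 403 question above, the usual suspects are an expired or wrong personal access token, or a profile pointing at the wrong workspace. A hedged checklist, assuming a recent dbx version and a placeholder workflow name:

    # re-enter the workspace URL and a fresh personal access token
    databricks configure --token
    # redeploy as assets and launch from them, per the dbx assets workflow
    dbx deploy --assets-only
    dbx launch <workflow-name> --from-assets
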
0
votes
2 answers

How to install spark-xml library using dbx

I am trying to install the library spark-xml_2.12-0.15.0 using dbx. The documentation I found says to include it in the conf/deployment.yml file like: custom: basic-cluster-props: &basic-cluster-props spark_version: "10.4.x-cpu-ml-scala2.12" …
jalazbe
  • 1,801
  • 3
  • 19
  • 40
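
For the spark-xml question above, a hedged sketch: since spark-xml is a JVM library, it goes in as a Maven coordinate rather than a PyPI package, attached to the workflow's libraries list in conf/deployment.yml. The job name, cluster settings, and entry script are placeholders:

    workflows:
      - name: "xml-job"
        new_cluster:
          spark_version: "10.4.x-cpu-ml-scala2.12"
          node_type_id: "Standard_DS3_v2"           # placeholder
          num_workers: 1
        libraries:
          - maven:
              coordinates: "com.databricks:spark-xml_2.12:0.15.0"
        spark_python_task:
          python_file: "file://jobs/parse_xml.py"   # placeholder entry script
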
0
votes
1 answer

Nested Python package structure and using it to create Databricks wheel task

Problem understanding python package structure and how to use it to trigger a python wheel task in Databricks. So, it could either be something fundamental related to python packages/modules that I misunderstand, or something specific to Databricks. I…
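
For the package-structure question above, a hedged sketch of the deployment side of a python_wheel_task; package, entry point, and cluster names are placeholders:

    workflows:
      - name: "wheel-job"
        tasks:
          - task_key: "main"
            job_cluster_key: "default"        # cluster defined elsewhere in the file
            python_wheel_task:
              package_name: "my_package"      # the wheel's distribution name
              entry_point: "main"             # console_scripts entry point name
              parameters: ["--env", "dev"]

This assumes the wheel's packaging metadata declares a matching console_scripts entry point, e.g. main = my_package.jobs.entrypoint:main, so the nested module path is resolved by the entry point rather than by the task definition.
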