1. I have created an Azure ML pipeline consisting of 4 steps. The first two steps are Python script steps, the third is a Databricks step, and the fourth is another Python script step. I create a PipelineData object and pass it to all subsequent steps:

```
prepped_data_parameter = PipelineData('prepped_parameter', datastore=data_store)
```

The 2nd Python script step can read the value from the PipelineData, but it does not work in the Databricks step.

2. I have also tested passing data from one Databricks step to another Databricks step, thinking that the DBFS path might be causing the problem. It does not work there either. A sketch of how the steps are wired together follows.
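Roughly how the steps are wired together — the compute names, script names, cluster id, and notebook path here are placeholders rather than my real values:

```
from azureml.core import Workspace
from azureml.core.compute import ComputeTarget
from azureml.pipeline.core import Pipeline, PipelineData
from azureml.pipeline.steps import PythonScriptStep, DatabricksStep

ws = Workspace.from_config()
data_store = ws.get_default_datastore()

# Placeholder names for compute targets already attached to the workspace.
cpu_cluster = ComputeTarget(workspace=ws, name='cpu-cluster')
databricks_compute = ComputeTarget(workspace=ws, name='adb-compute')

prepped_data_parameter = PipelineData('prepped_parameter', datastore=data_store)

# Step 1: produces the intermediate data; the PipelineData path is handed
# to the script as a command-line argument.
step1 = PythonScriptStep(
    name='prep_1',
    script_name='prep1.py',              # placeholder script names
    source_directory='scripts',
    arguments=['--prepped_parameter', prepped_data_parameter],
    outputs=[prepped_data_parameter],
    compute_target=cpu_cluster,
)

# Step 2: consumes the same PipelineData; reading it here works.
step2 = PythonScriptStep(
    name='prep_2',
    script_name='prep2.py',
    source_directory='scripts',
    arguments=['--prepped_parameter', prepped_data_parameter],
    inputs=[prepped_data_parameter],
    compute_target=cpu_cluster,
)

# Step 3: Databricks step consuming the same PipelineData; this is where
# reading the data fails. (Step 4, another PythonScriptStep, is omitted.)
step3 = DatabricksStep(
    name='databricks_process',
    notebook_path='/Shared/process',             # placeholder notebook
    existing_cluster_id='0000-000000-abcdefgh',  # placeholder cluster id
    inputs=[prepped_data_parameter],
    compute_target=databricks_compute,
    run_name='databricks_process',
    allow_reuse=False,
)

pipeline = Pipeline(workspace=ws, steps=[step1, step2, step3])
```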

The Python script step produces a path like this for the PipelineData:

```
/mnt/batch/tasks/shared/LS_root/jobs/******/azureml/bb743894-f7d6-4125-***-bccaa854fb65/mounts/workspaceblobstore/azureml/******-742d-4144-a090-a8ac323fa498/prepped_parameter/
```
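Inside the Python script steps, the PipelineData arrives as an ordinary command-line argument pointing at that local mount, so plain file I/O works. A minimal sketch of the consuming script, with the argument name matching the wiring sketch above:

```
# prep2.py -- runs in the 2nd Python script step (this part works).
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('--prepped_parameter', type=str)
args = parser.parse_args()

# Resolves to the local blobstore mount shown above.
print(args.prepped_parameter)
for entry in os.listdir(args.prepped_parameter):
    print(entry)
```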

The Databricks step produces this for the same PipelineData:

```
wasbs://azureml-blobstore-*******-983b-47b6-9c30-d544eb05f2c6@*************l001.blob.core.windows.net/azureml/*****-742d-4144-a090-a8ac323fa498/prepped_parameter/
```
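In the Databricks notebook, as far as I understand it, the path only arrives as a notebook parameter named after the PipelineData, and it comes back as the wasbs:// URI above rather than a local path:

```
# Databricks notebook cell (sketch).
# Azure ML passes the PipelineData location as a notebook parameter.
input_path = dbutils.widgets.get('prepped_parameter')
print(input_path)  # -> wasbs://azureml-blobstore-...@....blob.core.windows.net/.../prepped_parameter/

# This is a remote blob URI, not a mounted local path, so plain Python
# file I/O (open, os.listdir) cannot read it directly. Spark can read it,
# but only if the cluster already has credentials for that storage account:
df = spark.read.parquet(input_path)  # assumes the producing step wrote parquet
```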

I want to know how I can efficiently pass PipelineData from a Python script step to a Databricks step (or vice versa) without manually storing the intermediate data in a datastore and deleting it afterwards.
