
I've set up a Data Factory v2 pipeline with one Databricks notebook activity. The Databricks notebook creates an SQL cursor, mounts the storage account, and rejects or stages files (or blobs) that land in the storage account path that has been set. The infrastructure of the project is shown below:

[Image: project infrastructure diagram]
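For reference, the mounting step in the notebook looks roughly like this (a minimal sketch; the storage account, container, mount point, and secret scope names are placeholders, not my real values):

```python
# Mount the storage account container so the notebook can read incoming blobs.
# All names below are placeholders for the actual account/container/secret scope.
storage_account = "<storage-account>"
container = "<container>"
mount_point = "/mnt/landing"

# Only mount if it isn't mounted already, otherwise dbutils.fs.mount raises an error.
if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net":
                dbutils.secrets.get(scope="<secret-scope>", key="<storage-account-key>")
        },
    )
```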

I had initially set a trigger so that once a file has been uploaded into the blob storage, the pipeline is triggered, which consists of running the Databricks notebook; however, I found that the pipeline did not trigger. I then attempted to debug the Databricks activity in ADF with the pipeline parameter's default value left undefined, but it came up with an error. However, when I specified a default value for the pipeline parameter (Product.csv), the pipeline ran perfectly.

I had initially set the pipeline parameters fileName and folderPath, leaving the default value empty for both, but the pipeline came up with an error.

I have included the following parameters within my pipeline:

[Screenshots: pipeline parameter settings]

When fileName's default value is set to the CSV name, the pipeline works. I believe I should be able to leave the default values empty and, with the parameters defined, have the blob/folder name picked up and passed through the pipeline, where it is then read by Databricks when the notebook executes. I'm unsure what I may be doing wrong and would appreciate the assistance!
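For context, this is roughly how I expect the notebook to pick up the values passed from the pipeline (a minimal sketch; the widget names fileName and folderPath mirror my pipeline parameters, and the stage/reject rule is simplified):

```python
# Declare widgets with empty defaults, then read the values that ADF passes in
# through the Databricks Notebook activity's base parameters.
dbutils.widgets.text("fileName", "")
dbutils.widgets.text("folderPath", "")

file_name = dbutils.widgets.get("fileName")
folder_path = dbutils.widgets.get("folderPath")  # available if the watched path has subfolders
print(f"Triggered by {folder_path}/{file_name}")

# Exact path construction depends on how the mount point maps onto the container.
incoming_path = f"/mnt/landing/{file_name}"

# Simplified stage/reject rule: stage .csv files, reject everything else.
if file_name.lower().endswith(".csv"):
    dbutils.fs.mv(incoming_path, f"/mnt/landing/staged/{file_name}")
else:
    dbutils.fs.mv(incoming_path, f"/mnt/landing/rejected/{file_name}")
```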

I would also like to know how to ensure that once a file enters blob storage, the pipeline I built is triggered.

MOT
  • Have you specified the values for the fileName and folderPath parameters while creating the storage event trigger? You need to specify the values as shown in [this image](https://i.imgur.com/uSf6IaR.png). When the pipeline is triggered by the storage event trigger, the details of the file which triggered it will be assigned to the parameters. Coming to debug: there you are testing what output your pipeline would produce (not actually triggering the pipeline), so you need to give those test values while debugging the pipeline. – Saideep Arikontham Oct 21 '22 at 15:51
  • Use the reference image and specify the respective parameter values as shown. That will assign the values once a file is created/uploaded to storage, and the pipeline will be triggered successfully. – Saideep Arikontham Oct 21 '22 at 15:53

1 Answer


To use the fileName and folderPath with storage event triggers, you need to assign values to your respective pipeline parameters using the triggerBody() property when you create the required storage event trigger.

For fileName: @triggerBody().fileName
For folderPath: @triggerBody().folderPath

[Screenshot: storage event trigger creation with the pipeline parameters set to the triggerBody() expressions]

  • When the pipeline is triggered by the storage event trigger, the details of the file which triggered the pipeline will be assigned to the parameters.

  • The reason debug does not work without specifying a default value is that, in debug mode, you are testing what output your pipeline would produce (not actually triggering the pipeline). So you need to give those test values while debugging the pipeline.

  • Without this, your pipeline would fail as nothing is being passed to your notebook.

  • The following is a demonstration. I have uploaded a file called sample.csv. The storage event trigger spots this and triggers the pipeline.

[Screenshot: pipeline run triggered after uploading sample.csv]

  • When you check the output, you can see that the values are assigned to the parameters. I used a Set Variable activity to join both with a comma:
@{pipeline().parameters.file_name},@{pipeline().parameters.folder_name}

[Screenshot: Set Variable activity output showing the joined parameter values]

Saideep Arikontham
  • Thanks for your response. Since I posted the question I resolved the issue. I had not selected trigger on selection. However, I have been having an issue in a previous project where my triggers are not being executed. Will post another question about that! – MOT Oct 22 '22 at 09:08