0

I am trying to parse the $$FILEPATH value in the "Additional columns" section of the Copy Activity.

The filepaths have a format of: time_period=202105/part-12345.parquet . I would like just the "202105" portion of the filepath. I cannot hardcode it because there are other time_period folders.

I've tried this (from the link below): @{substring($$FILEPATH, add(indexOf($$FILEPATH, '='),1),sub(indexOf($$FILEPATH, '/'),6))} but I get an error saying Unrecognized expression: $$FILEPATH

The only other things I can think of are using: 1) Get Metadata Activity + For each Activity or 2) possibly trying to do this in DataFlow

Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
Mariah Akinbi
  • 386
  • 1
  • 5
  • 19

1 Answers1

1

$$FILEPATH is the reserved variable to store the file path. You cannot add dynamic expression with $$FILEPATH.

You have to create a variable to store the folder name as required and then pass it dynamically in an additional column.

Below is what I have tried.

  1. As your folder name is not static, getting the folder names using the Get Metadata activity.

enter image description here

Get Metadata Output:

enter image description here

  1. Pass the output to the ForEach activity to loop all the folders.

enter image description here

  1. Add a variable at the pipeline level to store the folder name.

enter image description here

  1. In the ForEach activity, add the set variable activity to extract the date part from the folder name and add the value to the variable.

    @substring(item().name, add(indexof(item().name, '='),1), sub(length(item().name), add(indexof(item().name, '='),1)))
    

enter image description here

Output of Set variable:

enter image description here

  1. In source dataset, parameterize the path/filename to pass them dynamically.

enter image description here

  1. Add copy data activity after set variable and select the source dataset.

a) Pass the current item name of the ForEach activity as a file path. Here I hardcoded the filename as *.parquet to copy all files from that path (this works only when all files have the same structure).

b) Under Additional Column, add a new column, give a name to a new column, and under value, select to Add dynamic content and add the existing variable.

enter image description here

  1. Add Sink dataset in Copy data Sink. I have added the Azure SQL table as my sink.

enter image description here

  1. In Mapping, add filename (new column) to the mappings.

enter image description here

  1. When you run the pipeline, the ForEach activity runs the number of items in the Get Metadata activity.

enter image description here

Output:

enter image description here

NiharikaMoola-MT
  • 4,700
  • 1
  • 3
  • 15
  • THANK YOU!! I think I am really close. I get an error on Step 8. My container structure has an additional level and I'm not sure how to include that level with the dynamic variable, @item().name. I cannot preview data nor import mapping. Here is what I'm seeing: https://ibb.co/FzbWn9S – Mariah Akinbi Oct 11 '21 at 15:45
  • 1
    Can you try using concat in the expression. Add dynamic content--> @concat('kpi_test/', item().name) – NiharikaMoola-MT Oct 11 '21 at 15:49
  • 1
    THAT WORKED!! Thank you thank you! Wow that's great! Your answer is so detailed -it is much appreciated!! Thanks again https://ibb.co/qrHF4JC – Mariah Akinbi Oct 11 '21 at 16:29
  • You have no idea how often this will be reapplied! :) How did you learn ADF? A course, building, reading? – Mariah Akinbi Oct 11 '21 at 16:33