I am trying to create a copy activity between two Azure Data Lake Storage Gen1 accounts. I don't need to copy all the folders from the source Data Lake. For example, given the following directory structure:

rootFolder/subfolder/2015
rootFolder/subfolder/2016
rootFolder/subfolder/2017
rootFolder/subfolder/2018
rootFolder/subfolder/2019
rootFolder/subfolder/2020

I only want to copy the data from the folders for 2017 onwards.

Is there a way to implement this automatically, without specifying the field as a parameter and setting it when the pipeline runs?

Mikel Laburu

1 Answer

Combining the Get Metadata, For Each, and If Condition activities can implement your requirement. Please refer to my approach:

First, my test files reside in ADLS as below:

enter image description here

test1.json in 2016, test2.json in 2017, test3.json in 2018

In ADF, 1st layer:

enter image description here

Dataset for Get Metadata Activity:

enter image description here

enter image description here

Configuration for For Each Activity:

enter image description here

Then, 2nd layer:

enter image description here

enter image description here

Finally, 3rd layer:

enter image description here

Source Dataset in copy activity:

enter image description here

Test result: only test1 and test2 were copied.

enter image description here

So it works for me. If you have any concerns, please let me know.
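To make the three layers easier to follow without the screenshots, here is a minimal Python sketch of the same logic. It is not ADF code: `child_items` and `folders_to_copy` are hypothetical names, and the list only imitates the shape of the `childItems` output that Get Metadata returns. The If Condition quoted in the comments uses `lessOrEquals`, which matches the test result above (2016 and 2017). For "2017 onwards", as the question asks, the comparison would be flipped; in ADF expressions that should be `greaterOrEquals`, which is what the sketch mirrors.

```python
# Hypothetical stand-in for the childItems output of the Get Metadata activity.
child_items = [
    {"name": "2015", "type": "Folder"},
    {"name": "2016", "type": "Folder"},
    {"name": "2017", "type": "Folder"},
    {"name": "2018", "type": "Folder"},
    {"name": "2019", "type": "Folder"},
    {"name": "2020", "type": "Folder"},
]


def folders_to_copy(items, min_year=2017):
    """Return the names of folders whose numeric name is >= min_year."""
    selected = []
    for item in items:  # plays the role of the For Each activity
        if item["type"] != "Folder":
            continue  # ignore plain files sitting next to the year folders
        if int(item["name"]) >= min_year:  # plays the role of the If Condition
            selected.append(item["name"])  # the Copy activity would run here
    return selected


print(folders_to_copy(child_items))
```

In the pipeline itself, each selected name would be passed to the source dataset of the Copy activity as the folder to read.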

Jay Gong
  • First of all, thanks for your answer, but I have a couple more questions. Would it be possible to have a wildcard folder path in the source dataset, so that the path is something like `rootFolder/subfolder1/*/subfolder3/2017`? And would this logic still work if the final folders were `YYYY=2017`, `YYYY=2018`, and so on, instead of just the year? The path would then be `rootFolder/subfolder1/*/subfolder3/YYYY=2017`. Thank you very much for your help once again. – Mikel Laburu Mar 19 '20 at 08:12
  • @MikelLaburu So, I got your main concern. Your folder naming rule is `YYYY=2017`, not `2017`. So you just want to know how to do the comparison in that case, am I right? – Jay Gong Mar 19 '20 at 09:44
  • Yes, and sorry if they are somewhat simple questions, but I am new to Azure and Azure Data Factory. – Mikel Laburu Mar 19 '20 at 09:52
  • @MikelLaburu If I understood you correctly, the folder name will be "YYYY=2017" instead of "2017". In that case, instead of using @lessOrEquals(int(item().name),2017) in the If Condition, you need to take the substring for the year part of [YYYY=2017], as follows: @lessOrEquals(int(substring(item().name,5,4)),2017). Hope it helps :) – Hyndavi Mar 19 '20 at 10:00
  • @Hyndavi Thank you for sharing; your solution is absolutely right. – Jay Gong Mar 20 '20 at 01:06
  • @MikelLaburu Hi, you could use the `substring` function to refine the expression; @Hyndavi's idea is right. If you have any further questions, just let me know. – Jay Gong Mar 20 '20 at 01:07
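For readers who want to check the `substring(item().name,5,4)` arithmetic from the comments, here is a small Python sketch. The helper name `year_from_folder` and the sample folder list are hypothetical. It assumes every folder name starts with the 5-character prefix `YYYY=` followed by a 4-digit year, and it uses a >= comparison to match "2017 onwards" from the question rather than the `lessOrEquals` used in the comment.

```python
def year_from_folder(name, prefix="YYYY="):
    """Extract the year from a folder name such as 'YYYY=2017'.

    Mirrors substring(item().name, 5, 4): skip the 5-character prefix,
    then take the next 4 characters and convert them to an integer.
    """
    if not name.startswith(prefix):
        raise ValueError(f"unexpected folder name: {name!r}")
    start = len(prefix)  # 5 for 'YYYY='
    return int(name[start:start + 4])


# Hypothetical folder names following the naming rule from the comments.
names = ["YYYY=2016", "YYYY=2017", "YYYY=2018"]
selected = [n for n in names if year_from_folder(n) >= 2017]
print(selected)
```

The explicit prefix check is a design choice for the sketch: a folder that does not follow the naming rule fails loudly instead of being silently converted to a wrong year.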