
The sort of metadata I am after includes file sizes, number of rows, file names, whether the file has already been processed, and so on. I want to capture the flow of data from source to target, including metadata from both Azure Data Lake and Azure SQL DB.

I also want to store this metadata in SQL tables, both as a control table and as an audit trail of how the files/tables/data have changed over the entire ETL/ELT process.
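For concreteness, a control table along these lines is roughly what I have in mind (the schema, table, and column names are just illustrative):

```sql
-- Hypothetical control/audit table for file- and table-level metadata
CREATE TABLE etl.FileControl (
    FileControlId    INT IDENTITY(1,1) PRIMARY KEY,
    PipelineRunId    UNIQUEIDENTIFIER NULL,       -- ADF pipeline run that logged the row
    SourceSystem     NVARCHAR(100)    NOT NULL,   -- e.g. 'ADLS' or 'AzureSqlDb'
    FilePath         NVARCHAR(400)    NOT NULL,   -- file path or table name
    FileSizeBytes    BIGINT           NULL,
    RowCnt           BIGINT           NULL,
    LastModifiedUtc  DATETIME2        NULL,
    IsProcessed      BIT              NOT NULL DEFAULT 0,
    LoggedUtc        DATETIME2        NOT NULL DEFAULT SYSUTCDATETIME()
);
```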

The only way I could think of doing this was to use stored procedures, called from ADF, that collect the metadata for each step and store it in SQL tables, but I wasn't sure how I could read the metadata from the files in the data lake.
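The SQL side of that seems straightforward; a sketch of the logging procedure against the hypothetical table above:

```sql
-- Hypothetical procedure that ADF would call once per file/step
CREATE PROCEDURE etl.usp_LogFileMetadata
    @SourceSystem    NVARCHAR(100),
    @FilePath        NVARCHAR(400),
    @FileSizeBytes   BIGINT           = NULL,
    @RowCnt          BIGINT           = NULL,
    @LastModifiedUtc DATETIME2        = NULL,
    @PipelineRunId   UNIQUEIDENTIFIER = NULL
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO etl.FileControl
        (SourceSystem, FilePath, FileSizeBytes, RowCnt, LastModifiedUtc, PipelineRunId)
    VALUES
        (@SourceSystem, @FilePath, @FileSizeBytes, @RowCnt, @LastModifiedUtc, @PipelineRunId);
END;
```

The open part is how to get the data lake file metadata into those parameters in the first place.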

Has anyone come up with an approach for doing this, or even a better solution?

mussi89

1 Answer


You could read the metadata of data lake files via the Get Metadata activity. According to the official documentation, the output from the Get Metadata activity can be used in conditional expressions to perform validation.
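As a minimal sketch, an activity definition requesting the fields the question asks about might look like this (the dataset name `AdlsFileDataset` is an assumption):

```json
{
    "name": "GetFileMetadata",
    "type": "GetMetadata",
    "typeProperties": {
        "dataset": {
            "referenceName": "AdlsFileDataset",
            "type": "DatasetReference"
        },
        "fieldList": [ "itemName", "size", "lastModified", "exists" ]
    }
}
```

The output fields can then be referenced in downstream activities or conditions, e.g. `@activity('GetFileMetadata').output.size` in an If Condition expression.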

It supports Azure Data Lake connectors:

[Screenshot: connectors supported by the Get Metadata activity, including Azure Data Lake Storage]
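To then persist that output into your SQL control table, one option is to chain a Stored Procedure activity after the Get Metadata activity; a sketch, assuming the logging procedure from the question (the linked-service name is also an assumption):

```json
{
    "name": "LogFileMetadata",
    "type": "SqlServerStoredProcedure",
    "linkedServiceName": {
        "referenceName": "AzureSqlDb",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "storedProcedureName": "etl.usp_LogFileMetadata",
        "storedProcedureParameters": {
            "SourceSystem": { "value": "ADLS", "type": "String" },
            "FilePath": {
                "value": { "value": "@activity('GetFileMetadata').output.itemName", "type": "Expression" },
                "type": "String"
            },
            "FileSizeBytes": {
                "value": { "value": "@activity('GetFileMetadata').output.size", "type": "Expression" },
                "type": "Int64"
            }
        }
    }
}
```

Note that the `childItems` field only returns the first level of a folder, so walking a folder tree means looping, e.g. a ForEach over `@activity('GetFolderMetadata').output.childItems` with a nested Get Metadata per item.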

Jay Gong
  • I tried using this to recursively read files in a folder, but the information I got from the child elements was not sufficient. I couldn't find a good resource to show me how to update the JSON to retrieve what I want. Also, I wasn't sure how to store the output information. – mussi89 Mar 15 '19 at 10:52