
I'm working on an ETL pipeline in Azure Synapse.

In the previous version I used an Array parameter on the pipeline, and it contained JSON objects. For example:

[{"source":{"table":"Address"},"destination {"filename:"Address.parquet"},"source_system":"SQL","loadtype":"full"}}]

This array was later used as item(): a ForEach, switches, ifs and nested pipelines processed all the tables. I just passed the item's properties down to the sub-pipelines and it worked fine.

My task is now to create a dedicated SQL pool and a table which stores parameters as columns. The columns are: source_table, destination_file, source_system and loadtype.

Example:

source_table | destination_file  | source_system | loadtype
"Address"    | "Address.parquet" | "SQL"         | "full"

I don't know how to use this table in the ForEach activity or how to process the tables this way, since it is not an array.

What I've done so far: I created the dedicated SQL pool and the following stored procedures:

  • create_parameters_table
  • insert_parameters
  • get_parameters
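
For reference, a minimal sketch of what create_parameters_table and insert_parameters might run; the column types and distribution options are assumptions for illustration, not the actual procedure bodies:

CREATE TABLE parameters
(
    source_table     NVARCHAR(200),
    destination_file NVARCHAR(200),
    source_system    NVARCHAR(50),
    loadtype         NVARCHAR(20)
)
WITH (HEAP, DISTRIBUTION = ROUND_ROBIN); -- assumed: small metadata table, columnstore not needed

INSERT INTO parameters (source_table, destination_file, source_system, loadtype)
VALUES ('Address', 'Address.parquet', 'SQL', 'full');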

get_parameters is a plain SQL SELECT statement, but I don't know how to turn its result into something that can be used in the ForEach activity.

CREATE PROCEDURE get_parameters
AS
BEGIN
    SELECT source_table, destination_filename, source_system, load_type
    FROM parameters;
END

All of these procedures are called in the pipeline as SQL pool stored procedure activities. I don't know how to loop through the tables this way. I need every row to act as one table, i.e. one object, just like in the array.

Mons
  • Is this table stored in a dedicated SQL pool? – Aswin Jan 23 '23 at 10:30
  • Hi Aswin, yes, the table is stored in a dedicated SQL pool created in the Azure Synapse workspace; that's why I was experimenting with stored procedures. I want to use this table as the table parameters: one row represents one table (like one object in the JSON list). – Mons Jan 23 '23 at 10:33
  • You used the JSON data as the items of the ForEach. Now, instead of JSON data, you have the data in a table. Is this correct? – Aswin Jan 23 '23 at 10:36
  • The stored procedures create the table, and get_parameters returns the parameters and their values, but I don't know how to loop through these like I did with the JSON list. I also don't know how I could set this as item() in the ForEach, switch, if or nested pipelines. – Mons Jan 23 '23 at 10:37
  • Yes, I used the JSON list in the beginning. In the ForEach there is a Switch that decides whether item().loadtype is SQL or not; if it is, it goes to a sub-pipeline. In there, an If decides whether it is a full load, and if it is, a query might look like this: select * from [schema].[@{item().source.table}] (the table-driven equivalents of these expressions are sketched after these comments). – Mons Jan 23 '23 at 10:41
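
For clarity, a sketch of how those expressions map once each ForEach item is a row of the parameters table rather than a JSON object. This assumes the Lookup-based setup described in the answer below and the column names listed in the question: nested properties become flat column names, top-level ones keep theirs.

Switch expression:        @item().loadtype
If condition (full load): @equals(item().loadtype, 'full')
Full-load query:          select * from [schema].[@{item().source_table}]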

1 Answer

  • Take a Lookup activity in the Azure Data Factory / Synapse pipeline and, in the source dataset of the Lookup activity, select the table that has the required parameter details.

  • Make sure to uncheck the First row only check box.

  • Then take the ForEach activity and connect it to the Lookup activity.

  • In the settings of the ForEach activity, click Add dynamic content under Items and enter @activity('Lookup1').output.value (a sketch of the resulting output shape follows this list).


  • Then you can add other activities such as Switch/If inside the ForEach activity.
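
For illustration, with First row only unchecked, the Lookup output would look roughly like this for the example row from the question (the real output also carries extra metadata fields), and @activity('Lookup1').output.value refers to the value array inside it:

{
    "count": 1,
    "value": [
        {
            "source_table": "Address",
            "destination_file": "Address.parquet",
            "source_system": "SQL",
            "loadtype": "full"
        }
    ]
}

Inside the ForEach, item() is one object from that array, so the columns are referenced as item().source_table, item().destination_file, item().source_system and item().loadtype, much like the properties of the original JSON objects.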
Aswin
  • Thank you, but will this also work with an Azure Synapse dedicated SQL pool? I can see that in the Lookup activity I can select the dedicated SQL pool as the source, but the table is created in the pipeline by stored procedures beforehand and is called "parameters", and there is no separate database to create within the dedicated pool, so how should I refer to the table? – Mons Jan 23 '23 at 17:27
  • Does the stored procedure in the SQL pool create the table in the dedicated SQL pool? – Aswin Jan 24 '23 at 03:40