I am working on a T-SQL script with the following logic:
declare @tbl table(id int)
insert @tbl select distinct id -- <-- this takes about 3 minutes
from link1.some_table
where <some_conditions>
insert some_local_table(field1, field2, fieldN) -- <-- this takes about 30 minutes
select field1, field2, fieldN
from link2.some_table
where id in (select id from @tbl)
and <some conditions>
insert some_OTHER_local_table(field1, field2, fieldN) -- <-- another 30 minutes
select field1, field2, fieldN
from link2.some_OTHER_table
where id in (select id from @tbl)
and <some conditions>
insert some_YET_OTHER_local_table(field1, field2, fieldN) -- <-- another 30m, and so on
select field1, field2, fieldN
from link2.some_YET_OTHER_table
where id in (select id from @tbl)
and <some conditions>
... and so on, there are 5 inserts into 5 different local tables sourced from linked server link2 but filtered using @tbl which originates from linked server link1.
Populating each of the local tables takes about 30 minutes so the whole script takes some 2.5 hours.
I am trying to parallelize this process so I am putting each of the 5 inserts into a separate SSIS task and executing them concurrently. However, I have no clue how to share the @tbl table variable between tasks in SSIS. In a script, @tbl is just a variable. In SSIS there is no table variables. Btw. the table holds about 200K id values so building a comma-separated string is out of reach.
So the question is: how to implement a shared, "disposable", package-level recordset object that would be initialized from a database query and then used in a couple of parallel tasks within the same package?
I am on SQL Server 2008 R2.