
I'm reposting this from a Msft forum hoping to get more responses here. Apologies if that is frowned upon. ... I'm currently looking at refactoring an existing, large SSIS 2012 implementation that consists of about 55 projects and 360+ packages. The ETL framework that is in use has a "main" control package that reads from a database table and determines which packages are ready to execute (based on some dependency logic) and then uses an Execute Process task within a loop that calls dtexec with the arguments:

/C start Dtexec /SQL "SomePackagePath" /SERVER "someserver"

This design allows the loop to execute a package and then immediately iterate, because it doesn't wait for the package to complete (with success or failure), so it can quickly kick off as many packages as are ready to execute. A SQL Agent job calls this package every few minutes so that it can pick up any packages whose dependencies have been satisfied since the last execution and kick those off.
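The dependency check itself isn't shown above, but purely as an illustration, the "which packages are ready" query that the control package runs against its metadata table could look something like the following. All table and column names here are hypothetical, not the actual framework's schema:

-- Hypothetical metadata query: a package is "ready" when it is still pending
-- and none of its prerequisites have failed or are still pending.
-- dbo.EtlPackage / dbo.EtlDependency and their columns are assumed names.
SELECT p.PackageId, p.PackagePath
FROM dbo.EtlPackage AS p
WHERE p.Status = 'Pending'
  AND NOT EXISTS
      (SELECT 1
       FROM dbo.EtlDependency AS d
       JOIN dbo.EtlPackage AS prereq
         ON prereq.PackageId = d.DependsOnPackageId
       WHERE d.PackageId = p.PackageId
         AND prereq.Status <> 'Succeeded');

Each row returned is then handed to the Execute Process Task inside the loop, which (given the /C prefix) shells out through cmd.exe to "start dtexec", so the loop never blocks on a child package.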

It's a clever design, but it has some problems, such as decentralized exception handling (since the parent package is unaware of what is happening in the "asynchronous" dtexec calls).

My biggest concern is that by executing packages, not with the Execute Package Task but with the Execute Process Task, and spinning up many dtexecs, the framework is not leveraging SSIS's ability to handle threading, memory consumption, etc. across all running packages and executables because it is simply unaware of them. It's essentially like using an Execute Package Task with the ExecuteOutOfProcess property set to true.

I'm wondering how much risk this design introduces and whether there are any approaches I can take to minimize the risk of excess resource consumption. I apologize for not having an exact question here, but I would really appreciate any guidance anyone might have.

d.elish
  • This seems like a design that was created to solve a problem that escapes the both of us. – Eric Hauenstein Sep 01 '15 at 16:02
  • I would assume the design pattern is exactly as you state: get the packages started as soon as possible and don't let a failure in a child package impede the ability to start the next package. As long as there is *some* process *somewhere* that actually goes looking for failures, then I have no hatred for this. Since this is 2012, I'd be curious if they had a reason for not just running the packages in async mode from the SSISDB catalog but perhaps this design pattern was originally implemented on a 2005/2008 system. – billinkc Sep 01 '15 at 16:08
  • This was implemented pre-SSIS 2012. The framework does look for failure and ensures that dependencies are met prior to executing additional packages. Do you not see an issue with resource management being that each dtexec is not aware of what the other dtexecs are doing? – d.elish Sep 01 '15 at 16:13
  • Eric - the intention of the design was to create a metadata-driven dependency framework that allows for parallelism. It accomplishes this but at the cost of centralized exception handling and the overhead of kicking off a bunch of dtexecs. I'm all ears as to any better way to accomplish this. – d.elish Sep 01 '15 at 16:20
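For reference, the "async mode from the SSISDB catalog" that billinkc mentions above is driven by the catalog stored procedures: create an execution, leave (or explicitly set) the SYNCHRONIZED system parameter at 0 so that start_execution returns immediately, and start it. A minimal sketch follows; the folder, project, and package names are placeholders:

DECLARE @execution_id BIGINT;

-- Create an execution for a deployed package (placeholder folder/project/package names).
EXEC SSISDB.catalog.create_execution
    @folder_name     = N'SomeFolder',
    @project_name    = N'SomeProject',
    @package_name    = N'SomePackage.dtsx',
    @use32bitruntime = 0,
    @execution_id    = @execution_id OUTPUT;

-- SYNCHRONIZED = 0 (the default) makes start_execution return immediately,
-- mirroring the fire-and-forget behaviour of "start dtexec".
EXEC SSISDB.catalog.set_execution_parameter_value
    @execution_id,
    @object_type     = 50,              -- 50 = system parameter
    @parameter_name  = N'SYNCHRONIZED',
    @parameter_value = 0;

EXEC SSISDB.catalog.start_execution @execution_id;

Each execution still runs in its own process on the server (ISServerExec), so this alone doesn't solve the resource-management concern, but every execution is logged centrally in SSISDB (catalog.executions and the built-in reports), which at least gives failures a single place to surface.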

0 Answers