Microsoft's SCCM and various other products will do this. However, they only handle the specific task: installing the software. The big question is how you coordinate it.
In "The Practice of System and Network Administration" there is a chapter that recommends the following methodologies:
"one, some, many" -- Upgrade your own machine and test for a few days. Upgrade a few more (say, the other sysadmins on your team). Then roll out to "many": larger and larger groups.
"canary" -- Upgrade one every [period of time] until done.
"exponential" -- Upgrade 1, then 2 more, then 4 more, then 8 more. The group size doubles each time.
"risk-adverse last" -- Divide the organization into groups and do the most risk-accepting first, the more risk-adverse last. For example, there may be one group that prides itself on being cutting edge and will volunteer to go first (the IT department, the engineering department). There may be a group that is very suspicious of upgrades and they go last (accounting dept, the executives, etc.) Smaller groups should probably go first too.
No matter how you group the upgrades, test before you start and after each group.
After each "group" of upgrades, do a series of tests. If any tests fail, or if problems are reported, stop doing upgrades. Revert back to the previous release if possible (or safe).
The upgrades should not start until you have done your own tests, for example in a lab or on your own machine. More structured tests would include trying the upgrade on one of each kind of machine, one of each release of the OS, and so on. Tests should include starting and stopping the software, as well as running its major functions (since you mentioned Flash: try playing a video, running a Flash game, and so on). It is good to keep a wiki page that documents which combinations were tested and which tests you ran. The next time you upgrade this package you will have a good list of tests to use. If a problem is reported during upgrades, add a test to the list to prevent that problem in the future. Since you mentioned Flash: I recently found a problem between the Weight Watchers food-tracker app and a certain version of Flash. We added the URL for that app to the list of tests, and now we know that new Flash upgrades must be tested against it before we release them.
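If you want that checklist in a form a script can read as well as a wiki page, something this small works. The test names and the CSV layout are just one possible shape, mirroring the Flash example above:

```python
# Sketch: keep the per-package test checklist in a file you can grow over time.
# The test names and file layout are assumptions, not a standard format.
import csv
import datetime

FLASH_TESTS = [
    "start and stop the plugin",
    "play a video",
    "run a flash game",
    "load the Weight Watchers food-tracker page",  # added after a reported problem
]

def record_results(package, version, results, path="upgrade-tests.csv"):
    """Append one row per test so the next upgrade starts from a known checklist."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for test, passed in results.items():
            writer.writerow([datetime.date.today(), package, version, test, passed])
```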
Between each "group" of upgrades, pause for some amount of time to see if errors crop up. Whether this is a day or a week depends on many factors: is this a big change? were the previous group upgrades successful? Monitor your Helpdesk tickets for reports of problems related to the upgrade. If you have full-time helpdesk attendants, keep them apprised of what upgrades are in progress so they are on the lookout for problems.
Whether you use "one, some, many" or other methodologies depends on many factors. "One, some, many" is good in smaller environments. "Exponential" is good in a large desktop environment where hundreds of machines are centrally controlled. "Risk-adveres last" is good when you can divide your users into specific groups that have different "personalities". "Canary" is used on web farms and grid computing where you have hundreds or thousands of machines all with the same configuration.
The most important thing is to take good notes. If you had to do this upgrade once, you'll have to do more upgrades in the future. You want the process to become repeatable, and keeping a list of the tests performed is key to that. The next time you do a similar upgrade there will be less thinking to do, which means fewer mistakes ("oops, I forgot to test blah-blah-blah") and a faster rollout. In fact, if you keep even basic documentation, you can delegate this to that new junior sysadmin you've hired. He or she can repeat your process, add to it, and improve it. You can focus on training them and checking their work, and meanwhile you can work on other projects.