I'm now using password stretching for all user account passwords across all my websites. In the database I store an iteration count and a randomly generated salt alongside the final hash. The hash algorithm is SHA-512, and the code is C# targeting .NET 3.5 and 4.0 (a dual-framework library).
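To make the setup concrete, here is a minimal sketch of the kind of stretching described above. The exact chaining construction (hash the salt and password once, then repeatedly re-hash the previous digest together with the salt) is an assumption for illustration, as are the names `PasswordStretcher`, `NewSalt`, `Stretch` and `Verify`; my real code may differ in detail.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PasswordStretcher
{
    // Generates a random salt to store next to the hash and iteration count.
    public static byte[] NewSalt(int size)
    {
        byte[] salt = new byte[size];
        new RNGCryptoServiceProvider().GetBytes(salt);
        return salt;
    }

    // Assumed construction: h = SHA512(salt || password), then
    // h = SHA512(h || salt) repeated until 'iterations' hashes have run.
    public static byte[] Stretch(string password, byte[] salt, int iterations)
    {
        if (iterations < 1) throw new ArgumentOutOfRangeException("iterations");
        byte[] pwd = Encoding.UTF8.GetBytes(password);
        using (SHA512 sha = new SHA512Managed())
        {
            byte[] hash = sha.ComputeHash(Concat(salt, pwd));
            for (int i = 1; i < iterations; i++)
            {
                hash = sha.ComputeHash(Concat(hash, salt));
            }
            return hash;
        }
    }

    // Recomputes the hash with the stored salt and iteration count,
    // then compares without exiting early on the first mismatch.
    public static bool Verify(string password, byte[] salt, int iterations, byte[] expected)
    {
        byte[] actual = Stretch(password, salt, iterations);
        int diff = actual.Length ^ expected.Length;
        for (int i = 0; i < actual.Length && i < expected.Length; i++)
        {
            diff |= actual[i] ^ expected[i];
        }
        return diff == 0;
    }

    private static byte[] Concat(byte[] a, byte[] b)
    {
        byte[] result = new byte[a.Length + b.Length];
        Buffer.BlockCopy(a, 0, result, 0, a.Length);
        Buffer.BlockCopy(b, 0, result, a.Length, b.Length);
        return result;
    }
}
```

The salt, the iteration count and the 64-byte result are the three values that go into the database row for the account.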
For accounts that only ever get randomly assigned passwords (web service API users and the like) I keep the iteration count within a range such that a password check takes no more than about 1 second. Over the years, depending on whether or not these websites stick(!), I will look at increasing these ranges in line with CPU power.
For accounts where the user might be choosing the password themselves, I have cranked up the iteration count so a login can take around 5 seconds while the iterations are carried out.
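As a rough illustration of how an iteration count can be tied to a target duration (about 1 second for API accounts, about 5 seconds for user-chosen passwords), the sketch below times a sample run of chained SHA-512 hashes and scales the result. `IterationCalibrator` and `EstimateIterations` are hypothetical names, and the figure it returns depends on the machine and its current load, so treat it as a starting point rather than an exact value.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

public static class IterationCalibrator
{
    // Times 'sampleIterations' chained SHA-512 hashes on this machine and
    // scales that up to the number of iterations expected to fill 'target'.
    public static int EstimateIterations(TimeSpan target, int sampleIterations)
    {
        if (sampleIterations < 1) throw new ArgumentOutOfRangeException("sampleIterations");

        byte[] data = new byte[64];
        Stopwatch watch = Stopwatch.StartNew();
        using (SHA512 sha = new SHA512Managed())
        {
            for (int i = 0; i < sampleIterations; i++)
            {
                data = sha.ComputeHash(data);
            }
        }
        watch.Stop();

        double msPerIteration = watch.Elapsed.TotalMilliseconds / sampleIterations;
        if (msPerIteration <= 0.0001)
        {
            // Guard against a zero or near-zero reading from a very short sample.
            msPerIteration = 0.0001;
        }
        return Math.Max(1, (int)(target.TotalMilliseconds / msPerIteration));
    }
}
```

For example, `IterationCalibrator.EstimateIterations(TimeSpan.FromSeconds(5), 20000)` gives an approximate count for the 5 second case; re-running it on newer hardware is one way to raise the ranges over time.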
So I'm happy with the security of my passwords, but now I have another problem: I can flood an 8-core CPU with 100% usage for 5 seconds if I get 8 different people to log in at once!
My current solution is an iteration threshold: if a stretch operation exceeds it, I push the operation onto a queue that is handled by a single thread. I could extend this so that it uses at most half the processors in the machine.
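A sketch of the "at most half the processors" variant follows. It is not my actual queue: instead of a dedicated worker thread it uses a `System.Threading.Semaphore` to cap how many expensive stretches run at once, which works on both .NET 3.5 and 4.0. `StretchThrottle`, `Run` and the threshold value in the usage note are hypothetical.

```csharp
using System;
using System.Threading;

public sealed class StretchThrottle
{
    private readonly Semaphore _slots;
    private readonly int _iterationThreshold;

    public StretchThrottle(int iterationThreshold, int maxConcurrent)
    {
        if (maxConcurrent < 1) throw new ArgumentOutOfRangeException("maxConcurrent");
        _iterationThreshold = iterationThreshold;
        _slots = new Semaphore(maxConcurrent, maxConcurrent);
    }

    // Cheap operations run immediately; operations above the threshold
    // wait for a free slot so they cannot occupy every core at once.
    public T Run<T>(int iterations, Func<T> work)
    {
        if (iterations <= _iterationThreshold)
        {
            return work();
        }

        _slots.WaitOne();
        try
        {
            return work();
        }
        finally
        {
            _slots.Release();
        }
    }
}
```

It could be created once per process, for example `new StretchThrottle(50000, Math.Max(1, Environment.ProcessorCount / 2))`, and each login would wrap its stretch call in `Run`. With `maxConcurrent` set to 1 it limits the work to one stretch at a time like the single-thread queue, although the semaphore does not guarantee first-in, first-out ordering, and the waiting callers still block their own request threads.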
Is there anything better I can do? Have you implemented this pattern for password storage and logon, and if so, what did you do?