In your code sample (and you would see this with a profiler) you are wasting a LOT of time waiting for resources to become available to run those threads. Because you keep issuing more and more Parallel.For calls (each of which blocks its caller until all of its iterations have finished), your process spends significant time waiting for threads to finish and for the next thread to be scheduled, with an ever-growing number of threads all competing for time to run.
Consider this output from the profiler:
The RED color is synchronization! Look how much work the kernel is doing just to let my app run so many threads! Note that on a single-core processor you'd definitely see 100% CPU usage.

You're going to have the best time reading this XML by splitting the string and parsing the pieces separately (after loading it from I/O, of course). You may not see 100% CPU usage, but that's going to be the best option. Play with different partition sizes for the string (i.e. substring sizes).
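To make the "play with partition sizes" idea concrete, here is a minimal sketch. It assumes the loaded string has already been split into fragments that are each well-formed on their own (for example one `<item .../>` record per fragment); how you split depends entirely on your XML. `ChunkedXmlParsing`, `ParseFragments` and `rangeSize` are names I made up for illustration, and `XElement.Parse` is just a stand-in for whatever per-fragment parsing you end up doing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class ChunkedXmlParsing
{
    // Hypothetical helper: parses pre-split fragments in parallel.
    // rangeSize is the number of fragments handed to a worker at a time;
    // this is the knob to experiment with.
    public static XElement[] ParseFragments(string[] fragments, int rangeSize)
    {
        var results = new XElement[fragments.Length];

        // Partitioner.Create throws on an empty range, so bail out early.
        if (fragments.Length == 0)
        {
            return results;
        }

        // Each work item is an index range [Item1, Item2) rather than a
        // single fragment, which keeps scheduling overhead down.
        var ranges = Partitioner.Create(0, fragments.Length, rangeSize);

        Parallel.ForEach(ranges, range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                // Every index is written by exactly one worker, so no lock is needed.
                results[i] = XElement.Parse(fragments[i]);
            }
        });

        return results;
    }
}
```

Calling `ChunkedXmlParsing.ParseFragments(fragments, 500)` and then trying other values for the second argument while watching the profiler is the kind of experiment I mean; the right size depends on how expensive each fragment is to parse.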
For an amazing read on parallel patterns, I recommend this paper by Stephen Toub: http://download.microsoft.com/download/3/4/D/34D13993-2132-4E04-AE48-53D3150057BD/Patterns_of_Parallel_Programming_CSharp.pdf
EDIT: I did some searching for a smart way to read XML in multiple threads. My best advice is this:
- Split your XML files into smaller files if you can.
- Use one thread per XML file.
- If the first two options aren't sufficient for your perf needs, consider not loading it as XML at all, but partitioning the string (splitting it) and parsing a bit by hand (not into an XmlDocument). I would only do this if the first two options are not good enough for your needs. Each partition (substring) would run on its own thread. Remember too that "more threads" != "more CPU usage", at least not for your app. As we see in the profiler example, too many threads cost a lot of overhead. Keep it simple.
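The first two points could look roughly like the sketch below. It is only an illustration under a few assumptions: `PerFileXmlLoader` and `LoadAll` are hypothetical names, `XDocument.Load` stands in for whatever per-file parsing you actually do, and capping the parallelism at `Environment.ProcessorCount` is my starting guess rather than a measured optimum. `Parallel.ForEach` runs on thread-pool workers, so this is "one file per worker at a time" rather than a dedicated thread per file.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class PerFileXmlLoader
{
    // Hypothetical helper: loads each file on a single worker and caps
    // the number of workers so the scheduler is not flooded.
    public static IDictionary<string, XDocument> LoadAll(IEnumerable<string> paths)
    {
        var docs = new ConcurrentDictionary<string, XDocument>();

        var options = new ParallelOptions
        {
            // Assumption: one worker per core is a sensible first try.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(paths, options, path =>
        {
            // One whole file per iteration; no nested parallel loops.
            docs[path] = XDocument.Load(path);
        });

        return docs;
    }
}
```

The point of the single, flat loop is that the amount of parallelism is decided in one place. If the profiler still shows a lot of synchronization, lower `MaxDegreeOfParallelism` before reaching for anything more elaborate.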