I'm writing a high-ish volume web service in C# running in 64-bit IIS on Win 2k8 (.NET 4.5) that works with XML payloads and does a variety of operations on small and large objects (where the large objects are mainly strings, some over 85k (so going onto the LOH)). Requests are stateless, and memory usage remains steady over time. Lots of memory is being allocated and released per request, no memory appears to be being leaked.
Operating at a maximum of 25 transactions per second, with an average call lasting 5s, it's spending 40-60% of it's time in GC according to two profiling tools, and perfmon shows a steady 20 G0 and G1 collections over 5 seconds, and 15 G2 collections over 5 seconds - meaning lots of (we think) premature promtion into G2 for data that we'd expect to stay in G0. Everything I read indicates this is very excessive. We expect that the system should be able to perform at a higher throughput than 25 tps and assume the GC activity is preventing this.
The machines serving the requests have lots of memory - 16GB - and the application, under load, consumes at most 1GB when under load for an hour. I understand that a bigger heap won't necessarily make things better, but there is spare memory.
I appreciate this is light on specifics (will try to recreate the conditions with a trivial application if time permits) - but can anyone explain why we see so much G2 GC activity? Should I be focusing on the LOH? People keep telling me that the CLR's GC "adapts" to your load, but it's not changing it's behavior in this case and, unlike other runtimes, there seems to be little I can do to tune it (have tried workstation GC, but there is very little observable difference).