
I'm writing a rather convoluted unit test that uses Parallel.ForEach on a list of one million ints.

Using JetBrains DotTrace, I noticed that a lot of garbage collection is going on in the code under test. It takes up 10% of the total test running time and about 14% of the time spent in that method.

Here's a screenshot of the profiler:

[Screenshot: JetBrains DotTrace showing 10% of time spent on GC]

The culprit, I guess, is a local variable that captures the current time. Here is the code:

public PatientQueryMessage CreateQueryMessage(string patientNumber)
{
    var nowOffset = DateTimeOffset.Now; // GC nightmare?

    return new PatientQueryMessage
    {
        MessageHeader = new MessageHeader
        {
            DateTimeMessage = new TimeStamp(nowOffset),
            MessageControlId = $"{patientNumber}{nowOffset.Ticks.ToString().Substring(7, 5)}"
            // More properties set, but filtered out for brevity
        },
        QueryDefinition = new QueryDefinition
        {
            QueryDateTime = new TimeStamp(nowOffset),
            QueryId = $"{patientNumber}{nowOffset:MMddHHmmssf}"
            // More properties set, but filtered out for brevity
        }
    };
}

Now, here are the important bits of the test code:

Parallel.ForEach(myNumberArray,
    new ParallelOptions { MaxDegreeOfParallelism = maxThreads },
    () => /* TLocal Init */,
    (number, loopState, localResult) =>
    {
        var queryMessage = _queryMessageHelper.CreateQueryMessage(patientNumber);

        // Brevity...

        return localResult;
    },
    resultList =>
    {
         // Brevity...
    }
);

In the loop body, is there any way I can temporarily shut down or otherwise restrain the garbage collector, so that it only cleans up once everything has settled down?

Note that I care a lot more about the time it takes for the test to run than how much memory it consumes.
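To make the request concrete, here is a minimal sketch of what "restraining" the collector around a block could look like, using the real `GCSettings.LatencyMode` property from `System.Runtime`. The `GcRestraint` class and `RunWithLowLatency` method are hypothetical names, not part of the code under test. This mode only asks the runtime to avoid blocking generation 2 collections; generation 0 and 1 collections still run, so it may not remove much of the GC time seen with many short-lived objects.

```csharp
using System;
using System.Runtime;

// Hypothetical helper, not part of the code under test.
public static class GcRestraint
{
    // Runs `body` while asking the GC to avoid blocking gen-2 collections,
    // then restores whatever latency mode was active before.
    public static void RunWithLowLatency(Action body)
    {
        GCLatencyMode previous = GCSettings.LatencyMode;
        try
        {
            // The runtime may ignore this request, e.g. when background
            // (concurrent) GC is disabled in the configuration.
            GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
            body();
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }
    }
}
```

The whole `Parallel.ForEach` call would go inside the `body` delegate. A stricter option is `GC.TryStartNoGCRegion` together with `GC.EndNoGCRegion` (.NET Framework 4.6 and later), but its allocation budget is limited, so it is probably a poor fit for a loop that allocates a million messages.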

MarioDS
  • @BugFinder Oops, I *did* search but missed that one for some reason... – MarioDS Nov 30 '16 at 15:17
  • You should not shut off the GC, but simply set DotTrace to not count it in. You can right-click on it and ignore it from the analysis. – MakePeaceGreatAgain Nov 30 '16 at 15:17
  • @HimBromBeere how does that change the fact that it's there and that the time it uses is time I must wait for the test to complete? – MarioDS Nov 30 '16 at 15:18
  • Well it's definitely not collecting `nowOffset` because that variable is used further down in the trace log (e.g. `set_messageHeader`). – D Stanley Nov 30 '16 at 15:18
  • What if your unit test does not use parallelism? Parallelism adds a significant amount of overhead that may be causing the GC, not your code. Also, just because it runs that way in your unit test does NOT mean it always runs that way. The system will collect when it needs to (and has spare capacity to), so seeing GC in a unit test does not mean it is deterministic. – D Stanley Nov 30 '16 at 15:19
  • I don't understand *why* you want to turn it off. Omitting the GC is quite an unrealistic scenario, isn't it? So why test for it? – MakePeaceGreatAgain Nov 30 '16 at 15:20
  • It's not clear whether or not `var queryMessage` is being retained or has no references at the end of the loop body, but I'd guess that it is these instances that are needing to be GCed. If you made a list and dumped all of your object references into the list, then GC wouldn't get a look in on them. – spender Nov 30 '16 at 15:21
  • @DStanley I don't think it keeps the reference, `DateTimeOffset` is a struct and so it is passed by value. – MarioDS Nov 30 '16 at 15:21
  • @HimBromBeere I don't want to test *for* it, I just want to reduce the time required to run the test. – MarioDS Nov 30 '16 at 15:22
  • @spender it's not being retained, so your guess should be correct, but I was hoping for something cleaner or more direct than "just keep references to it"... – MarioDS Nov 30 '16 at 15:25
  • Unit tests should verify _results_, not _performance_. The test framework (plus the fact that your test runs things in parallel) can add overhead that will not be present in the actual release. If you have a performance problem in the actual system, _then_ you can try and isolate it, but I see nothing in the code you posted that is an obvious candidate for optimization. – D Stanley Nov 30 '16 at 15:26
  • @DStanley I'm not trying to optimize the actual code for time, the test verifies concurrency and thread-safety (the code generates an ID, the IDs need to be unique when requested in parallel). I'm actually trying to reduce the time needed for the test to run (because no one likes tests that take too long). – MarioDS Nov 30 '16 at 15:28
  • @DStanley If you're interested, I would like to chat with you, but unfortunately I don't know how to create a chat room on SO. – MarioDS Nov 30 '16 at 15:29
  • @MarioDS My point is that it is _used_ later on in the program, so it is not a candidate for GC. The fact that it's a struct makes it LESS likely to be GCed, since it's more likely to be on the stack. I strongly suspect that the overhead of unit tests and/or parallelism is what is causing the GC, more than the OP's code (unless there's something that was redacted). – D Stanley Nov 30 '16 at 15:29
  • @DStanley I'm definitely no expert with parallelism and profiling, but the line in dotTrace shows that it happens as part of the `CreateQueryMessage` method... Or am I just completely misinterpreting that? – MarioDS Nov 30 '16 at 15:31
