
I'm trying to build a small benchmarking application, which allows the user to measure the execution time and memory usage of a program.

This is the code to measure the execution time:

private static Stopwatch _stopwatch;

static void Main(string[] args)
{
    _stopwatch = new Stopwatch();

    //only for this test
    Console.WriteLine(TimeProcess("Empty.exe"));
    Console.WriteLine(TimeProcess("Sieve.exe"));
    Console.ReadKey();
}

private static long TimeProcess(String name)
{
    Process process = new Process();
    process.StartInfo.FileName = name;

    _stopwatch.Reset();
    _stopwatch.Start();
    process.Start();
    process.WaitForExit();
    _stopwatch.Stop();

    return _stopwatch.ElapsedMilliseconds;
}

To see if the code is working properly, I decided to implement the "Sieve of Eratosthenes" algorithm. I implemented it twice: once with a built-in stopwatch, and once without.

Sieve:

int[] numbersToTest = Enumerable.Range(0, 1000).ToArray();
int posInArray = 2;

while (numbersToTest[posInArray] != numbersToTest[numbersToTest.Length - 1])
{
    numbersToTest = numbersToTest.Where(x => x % numbersToTest[posInArray] != 0 || x == numbersToTest[posInArray]).ToArray();
    posInArray++;
}
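
As a point of reference, the same algorithm is more commonly written with a boolean array of composites instead of repeated LINQ filtering; this is only a sketch for comparison, not code from either of the benchmarked programs:

```csharp
// Classic Sieve of Eratosthenes sketch: mark composites in a bool array.
// The limit of 1000 matches the snippets above; everything else is illustrative.
bool[] isComposite = new bool[1000];
for (int p = 2; p * p < isComposite.Length; p++)
{
    if (isComposite[p]) continue;               // p is prime, sieve its multiples
    for (int multiple = p * p; multiple < isComposite.Length; multiple += p)
        isComposite[multiple] = true;           // mark every multiple of p
}
// The primes are the unmarked indices >= 2.
```
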

TimedSieve:

Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
int[] numbersToTest = Enumerable.Range(0, 1000).ToArray();
int posInArray = 2;

while (numbersToTest[posInArray] != numbersToTest[numbersToTest.Length - 1])
{
    numbersToTest = numbersToTest.Where(x => x % numbersToTest[posInArray] != 0 || x == numbersToTest[posInArray]).ToArray();
    posInArray++;
}
stopwatch.Stop();
Console.WriteLine(stopwatch.ElapsedTicks);

Moreover, I have a project with an empty main method. My logic was, that when I measure the execution of the "Sieve" and subtract the time of the empty project, the resulting number should be roughly the same as the number measured by the "TimedSieve". So I started measuring...

Empty: 79 milliseconds
Sieve: 53 milliseconds
TimedSieve: 4 milliseconds

Obviously these results seem very fishy:

  1. The TimedSieve is a lot faster than both the empty project and the Sieve
  2. The empty project is slower than the Sieve!

Just out of curiosity I also timed the Sieve and the empty project using PowerShell's Measure-Command:

Sieve: 25 milliseconds
Empty: 17 milliseconds

I noticed that the order in which the processes were measured influenced the results: the process measured first always lost. I also noticed that moving the start of the stopwatch to after the start of the process, like this

process.Start();
_stopwatch.Start();

got rid of the aforementioned effect (Empty is now always faster than Sieve) and produced numbers which are a lot closer to the results of the other measurement methods:

Empty: 34
Sieve: 42

While trying to solve the problem, I also read that benchmarks should include a "warmup" round, and I decided to benchmark both programs multiple times and take the average to get better results.

static void Main(string[] args)
{
    _stopwatch = new Stopwatch();

    //discard results of the first run
    TimeProcess("Sieve.exe");

    long sum = 0;
    for (int i = 0; i < 100; i++)
    {
        sum += TimeProcess("Sieve.exe");
    }
    Console.WriteLine(sum/100);

    TimeProcess("Empty.exe");

    sum = 0;
    for (int i = 0; i < 100; i++)
    {
        sum += TimeProcess("Empty.exe");
    }
    Console.WriteLine(sum/100);
    Console.ReadKey();
}

This got rid of the "empty slower than sieve" effect, which is why I decided to start the stopwatch before the process again.

How can I improve this code to get reliable results? While the numbers have gotten a lot more reasonable, they are still slower than both the PowerShell and the TimedSieve measurements.

Aifu
  • You are likely measuring disk IO time in there, which is why the results are fishy. The first process is slower because the disk has to seek, whereas the second process is likely already seeked to, and just goes right in. Unfortunately, there is no easy way to remove this effect. The reason your second solution (start it once, then run it 100 times) works is because Windows caches a programme into memory. If the cache is undisturbed (as it would be in that scenario) then it can just load the programme from the cache. – Der Kommissar May 14 '15 at 14:45
  • Thank you, that seems like a reasonable explanation of the first results, I doubt it would've entered my mind – Aifu May 14 '15 at 14:53
  • 1
    It's an often overlooked explanation, sometimes we forget (all of us programmers, even) that there are other things that go on at the beginning of a programme. Disk File IO, memory allocations, heap allocations, and so on. – Der Kommissar May 14 '15 at 14:55

1 Answer


Measuring the execution time of another process on a non-real-time OS can lead to inconsistent results. That is just the nature of systems without (mostly any) timing guarantees.

You have already more or less dealt with the warm-up issues (IO and IO caching related), and made the results more statistically sound with multiple runs.

But TimeOf(Algo) doesn't truly equal TimeOf(FullAppWithAlgo) - TimeOf(Empty).

It is close, but TimeOf(Algo) doesn't include:

  1. Time spent to JIT the application with actual code, not just an empty Main,
  2. Time spent to initialize static classes (which doesn't occur if a class is never used, as in Empty),
  3. Time for other minor and major things that happen.

They may be small time spans, but they still add to the execution time of the full app in a way that differs from the empty one.
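
To illustrate point 2: a type's static constructor runs only when the type is first used, so an app that never touches the type pays nothing for it. A contrived sketch (the Tables class is made up for illustration):

```csharp
static class Tables
{
    public static readonly int[] Squares;

    static Tables()                              // runs once, on first use of Tables
    {
        Squares = new int[1000];
        for (int i = 0; i < Squares.Length; i++)
            Squares[i] = i * i;
    }
}

// An app that never references Tables (like Empty) pays nothing for this
// initializer; one that does (like Sieve) pays for it exactly once.
```
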


Also, to make the results closer to those given by PowerShell, you may try to use Process.StartTime and Process.ExitTime in place of the Stopwatch when measuring the full application runtime:

private static long TimeProcess(String name)
{
    Process process = new Process();
    process.StartInfo.FileName = name;

    process.Start();
    process.WaitForExit();

    // ExitTime - StartTime yields a TimeSpan; truncate it to whole milliseconds
    return (long)(process.ExitTime - process.StartTime).TotalMilliseconds;
}

It won't change the difference between the empty and the full app, but it will give more consistent timings for each run, because you won't have to wait for the notification from the OS, which arrives some time after the app has already exited.
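
Combined with the warm-up and averaging from the question, the measuring loop might look like this sketch (assuming the ExitTime-based TimeProcess above):

```csharp
// Discard the first (cold-cache / JIT-affected) run, then average many runs.
const int runs = 100;
TimeProcess("Sieve.exe");

long total = 0;
for (int i = 0; i < runs; i++)
    total += TimeProcess("Sieve.exe");

Console.WriteLine(total / runs);
```
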

Eugene Podskal
  • Thank you for this very detailed answer! I actually wanted to use Stopwatch because of its high precision, but considering that most of my measurements will be above the millisecond range I think your DateTime solution is superior – Aifu May 14 '15 at 15:41
  • @Aifu Yes, it will give less precision, but for relatively large execution times and multiple runs it will be good enough. Also, as I've already said, it will give the real execution time without pre-startup and notification time, unlike the StopWatch. – Eugene Podskal May 14 '15 at 16:37