I have benchmarked my algorithm by running it 1000 times. Now I have all the timing values, and it would be interesting to know the mean, standard deviation, and median. The problem is that I don't know which statistics are correct for estimating these parameters. I'm not sure about assuming a normal distribution.

Nico Mkhatvari

2 Answers

Learn about statistics. There are lots of books, guides, papers and introductions out there (1, 2, 3, 4).
There are also lots of libraries which implement standard statistical methods.

And one last hint: for a quick initial result I often use Excel and its charting functions. It supports some statistical methods that you can play around with to see in which direction to continue.
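
As a concrete instance of the library route, here is a minimal sketch using Python's standard `statistics` module to compute the mean, standard deviation and median asked about in the question. The `summarize` helper and the `sample_times` values are hypothetical and only there for illustration. None of these summaries assumes a normal distribution, and the median is much less sensitive to a single slow run than the mean.

```python
import statistics

def summarize(times):
    """Return distribution-free summary statistics for a list of timings."""
    return {
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),   # sample standard deviation (n - 1)
        "median": statistics.median(times),
        "min": min(times),
        "max": max(times),
    }

# Hypothetical timings in milliseconds, including one slow outlier.
sample_times = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 50.0]
summary = summarize(sample_times)
print(summary)
```

With data like this, the outlier pulls the mean well above the median, which is a quick hint that the timings are skewed rather than normal.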

Lonzak

That really depends on the distribution your workload exhibits, so there is no generic answer to this.

But there is a trick: go one step further and run several iterations, each consisting of N calls, and compute, say, the average time or throughput for each whole iteration. Then, for a large N and consistent workload behavior across the calls, the iteration scores may be subject to the Central Limit Theorem, which makes them approximately normally distributed.
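
The iteration trick could be sketched roughly like this in Python. The names `batch_means`, `normal_confidence_interval` and `example_workload` are hypothetical, and the workload is just a stand-in for the algorithm under test. The confidence interval uses a plain normal approximation (z = 1.96), which is only reasonable if the iteration scores really do look normal.

```python
import statistics
import time

def batch_means(workload, iterations=30, calls_per_iteration=1000):
    """Time several iterations of N calls each; return mean time per call for each iteration."""
    scores = []
    for _ in range(iterations):
        start = time.perf_counter()
        for _ in range(calls_per_iteration):
            workload()
        elapsed = time.perf_counter() - start
        scores.append(elapsed / calls_per_iteration)
    return scores

def normal_confidence_interval(scores, z=1.96):
    """Approximate 95% interval for the mean, assuming roughly normal scores."""
    mean = statistics.mean(scores)
    half_width = z * statistics.stdev(scores) / len(scores) ** 0.5
    return mean - half_width, mean + half_width

def example_workload():
    # Hypothetical stand-in for the algorithm being benchmarked.
    sum(i * i for i in range(100))

scores = batch_means(example_workload, iterations=20, calls_per_iteration=200)
low, high = normal_confidence_interval(scores)
print(f"mean time per call: {statistics.mean(scores):.3e} s, "
      f"approx. 95% CI: [{low:.3e}, {high:.3e}]")
```

The statistics are computed over the per-iteration scores, not over individual calls, so it is worth checking a histogram of `scores` before trusting the interval.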

Aleksey Shipilev