I have a collection of n floating-point values: x[n]. To calculate the mean and standard deviation, I need to iterate over all values in two loops:

The first loop sums all values and calculates the mean:

sum = 0;
for (i = 0; i < n; i++)
    sum += x[i];
mean = sum / n;

In a second loop I calculate the standard deviation:

sum = 0;
for (i = 0; i < n; i++)
    sum += (x[i] - mean) * (x[i] - mean);
stddev = sqrt(sum / n);

I am aware that you cannot reduce this complexity if you want the exact values for the mean and standard deviation. But is there a way to calculate them in less time if an approximation is acceptable? Preferably in a single loop.

RomCoo

2 Answers

Have a look at the relevant section of the Wikipedia article on standard deviation. In particular, the last formula there (the variance as the mean of the squares minus the square of the mean) leads to the following algorithm:

    sum = 0;
    sumsqrd = 0;

    for (i = 0; i < n; i++) {
        sum += x[i];
        sumsqrd += x[i] * x[i];
    }

    mean = sum / n;
    stddev = sqrt(sumsqrd / n - mean * mean);
SirGuy
  • Oh man, I should have paid better attention in maths. I didn't think it would be that easy. – RomCoo Jul 07 '16 at 17:23
  • It's worth noting that the numerical stability of this algorithm is worse than that of computing the mean first and then computing the root mean square deviation from the mean. For example, with IEEE 754 doubles, it gives a standard deviation of 0 for [1e8+1, 1e8-1]. If you were going to accept an approximation anyway, that's probably fine, but it'd be wrong to think that this algorithm doesn't have downsides. – user2357112 Jul 07 '16 at 17:32
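
The stability concern raised in the comment above can be reproduced with a short Python sketch. This is only an illustration: the function names `stddev_two_pass` and `stddev_sum_of_squares` are made up here, and the `max(0.0, ...)` clamp is an added guard that is not part of the answer's pseudocode.

```python
import math

def stddev_two_pass(xs):
    """Population standard deviation: compute the mean first, then the squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

def stddev_sum_of_squares(xs):
    """Population standard deviation in one pass, from the sum and the sum of squares."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    mean = total / n
    # Added guard (assumption, not in the answer): rounding can make the
    # difference slightly negative, which would make math.sqrt raise.
    return math.sqrt(max(0.0, total_sq / n - mean * mean))

data = [1e8 + 1, 1e8 - 1]
print(stddev_two_pass(data))        # 1.0
print(stddev_sum_of_squares(data))  # 0.0
```

The squares of both inputs are too large to be stored exactly in a double, so the information about the spread is lost before the subtraction happens.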

Here's a version that does the calculation in one pass and is numerically more stable:

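mean = 0.0
sum_sqrs = 0.0   # running sum of squared deviations from the current mean
n = 0

loop do
  x = get_x()    # next value, or nil once the input is exhausted
  break if x.nil?
  delta = x - mean
  n += 1
  mean += delta / n
  sum_sqrs += delta * (x - mean)
end
sample_var = sum_sqrs / (n - 1)   # needs at least two values
sample_sd = Math.sqrt(sample_var)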

This is based on the formulas found in the bottom half of the Rapid calculation methods section of the Wikipedia page for Standard deviation.

pjs
  • @JamieMarshall May I suggest that you delete your comments since they were based on an implementation error? – pjs Jul 31 '23 at 01:24
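
For readers who want to try the running-update approach from the second answer on the same input as the comment under the first answer, here is a rough Python sketch. The update scheme is often called Welford's method. The function name `running_mean_and_variance` and the choice to return both the population variance (dividing by n, as in the question) and the sample variance (dividing by n - 1, as in the answer) are assumptions made for this illustration.

```python
import math

def running_mean_and_variance(values):
    """One pass over values; returns (mean, population variance, sample variance)."""
    n = 0
    mean = 0.0
    sum_sqrs = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        sum_sqrs += delta * (x - mean)
    if n == 0:
        raise ValueError("need at least one value")
    population_var = sum_sqrs / n
    # The sample variance is undefined for a single value; NaN is an arbitrary choice here.
    sample_var = sum_sqrs / (n - 1) if n > 1 else float("nan")
    return mean, population_var, sample_var

mean, pop_var, sample_var = running_mean_and_variance([1e8 + 1, 1e8 - 1])
print(mean, math.sqrt(pop_var), math.sqrt(sample_var))
```

On [1e8+1, 1e8-1] this yields a population standard deviation of 1.0, the same as the two-pass calculation, because the update only ever squares small deviations rather than the large raw values.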