I want to calculate the mean, standard deviation, skewness, kurtosis and covariance using one-pass algorithms. The simplest and fastest approach I found was published by Stuart McCrary of Berkeley Research Group. For example, for the standard deviation one can use:

std = sqrt((sum(x^2) - N*mean(x)^2) / (N - 1))
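As a minimal sketch of the formula above, assuming Python and integer inputs: the function name `one_pass_std` is made up for illustration and is not taken from McCrary's paper. It keeps the running sums as Python integers, which have arbitrary precision, so the subtraction is done exactly and only the final division and square root are rounded.

```python
import math

def one_pass_std(values):
    """Sample standard deviation from running sums collected in one pass."""
    n = 0
    total = 0      # running sum of x
    total_sq = 0   # running sum of x^2
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    if n < 2:
        raise ValueError("need at least two values")
    # Algebraically equal to (sum(x^2) - N*mean(x)^2) / (N - 1),
    # rearranged so the numerator stays an exact integer for integer input.
    numerator = n * total_sq - total * total
    return math.sqrt(numerator / (n * (n - 1)))

print(one_pass_std([2, 4, 4, 4, 5, 5, 7, 9]))
```

In a language with fixed-width accumulators the same idea applies only while the sums fit: values up to 10^6 have squares up to 10^12, so whether sum(x^2) stays exactly representable depends on how many values are summed and on the accumulator type.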

I have read that this approach is numerically unstable. I do not have a deep understanding of numerical stability, but as I understand it, the problem arises from the limited precision of floating-point operations.
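To illustrate what the instability looks like, here is a rough sketch in Python that evaluates the formula above in floating point and compares it with Welford's one-pass update, a different and commonly cited algorithm that is not part of the approach quoted above. The data set is hypothetical and deliberately extreme (a small spread on top of a large offset), so the two nearly equal sums cancel badly when subtracted.

```python
def naive_variance(values):
    """Textbook one-pass formula evaluated in floating point."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for x in values:
        n += 1
        total += x
        total_sq += x * x
    mean = total / n
    return (total_sq - n * mean * mean) / (n - 1)


def welford_variance(values):
    """Welford's one-pass update, shown as a swapped-in alternative."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / (n - 1)


# Hypothetical data, not from the question: small spread on a large offset.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]  # exact sample variance is 30
print(naive_variance(data))
print(welford_variance(data))
```

The squares here are around 10^18, beyond the 2^53 limit up to which a double represents every integer exactly, so the low-order digits that carry the variance are rounded away before the subtraction.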

In my case, I will only deal with integers in the range 10^1 to 10^6.

Can I use this approach in my case without worrying about numerical stability?

zlon
  • "While the textbook method can produce accurate results most of the time, a level of uncertainty remains that perhaps a particular trial pushes into an area where the textbook method is inaccurate." The research itself does not give details of its limitations! – Bassem Mar 26 '18 at 07:55
  • @BassemAkl The research does not, but many other sources state that the equation above is the fastest and simplest but suffers from numerical instability. – zlon Mar 26 '18 at 07:58

1 Answer

To improve the numerical stability, you can normalize the data. See: Wikipedia: normalization

For example, given a data set X_1, ..., X_n with mean x_bar and standard deviation s, normalize each value as (X_i - x_bar) / s.
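A minimal sketch of that normalization, assuming Python and its standard `statistics` module; the helper name `normalize` is only for illustration:

```python
import statistics

def normalize(values):
    """Return (X_i - x_bar) / s for every value in the data set."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1)
    return [(x - x_bar) / s for x in values]

z = normalize([2, 4, 4, 4, 5, 5, 7, 9])
print(statistics.mean(z), statistics.stdev(z))
```

Note that this sketch needs x_bar and s before it can rescale anything, so as written it makes a separate pass over the data after those two values are known.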

Stephen Rauch