One pass common statistics. Numerical stability for integers

Question

I want to calculate the mean, standard deviation, skewness, kurtosis and covariance using one-pass algorithms. The simplest and fastest approach I found was published by Stuart McCrary of Berkeley Research Group. For example, for the standard deviation one may use:

std = sqrt((sum(x^2) - N*mean(x)^2)/(N-1))

I have read that this approach is not good enough because it is numerically unstable. Unfortunately, I do not have a deep understanding of numerical stability, but as I understand it, the problem arises from the limited precision of floating-point operations.
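To make the instability concrete, here is a minimal sketch of the textbook formula accumulated in double-precision floats. The function name `textbook_var` and the sample data are hypothetical, and the values are deliberately much larger than the 10^1 to 10^6 range so that the cancellation between sum(x^2) and N*mean(x)^2 becomes visible.

```python
import statistics

def textbook_var(xs):
    """One-pass sample variance via (sum(x^2) - N*mean^2) / (N - 1), in floats."""
    n = 0
    s1 = 0.0  # running sum of x
    s2 = 0.0  # running sum of x^2
    for x in xs:
        x = float(x)
        n += 1
        s1 += x
        s2 += x * x
    mean = s1 / n
    # Two nearly equal, very large numbers are subtracted here;
    # most of the significant digits cancel.
    return (s2 - n * mean * mean) / (n - 1)

# Hypothetical data: a large offset with a small spread.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

reference_var = statistics.variance(data)  # computed with exact rational arithmetic
naive_var = textbook_var(data)

print("reference:", reference_var)
print("textbook :", naive_var)
```

The spread of the data is tiny compared with its magnitude, so rounding errors in the running sum of squares are larger than the quantity being computed. The textbook result can then be far from the reference, or even negative.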

In my case, I will deal only with integers in the range 10^1 to 10^6.

May I use this approach in my case without worrying about numerical stability?
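One way to look at the integer case: if the running sums are kept as integers, no rounding happens until the final division. The sketch below assumes Python, whose integers are arbitrary precision; the function name `exact_one_pass` and the sample data are hypothetical.

```python
import math
from fractions import Fraction

def exact_one_pass(xs):
    """One-pass mean and sample variance for integer data, with exact sums."""
    n = 0
    s1 = 0  # exact integer sum of x
    s2 = 0  # exact integer sum of x^2
    for x in xs:
        n += 1
        s1 += x
        s2 += x * x
    # n*s2 - s1^2 is an exact integer equal to n * sum((x - mean)^2),
    # so no cancellation error can occur in the subtraction.
    numerator = n * s2 - s1 * s1
    mean = Fraction(s1, n)
    var = Fraction(numerator, n * (n - 1))
    return mean, var

mean, var = exact_one_pass([4, 7, 13, 16])
std = math.sqrt(var)  # the only rounding step
print(mean, var, std)
```

In a language with fixed-width integers the same idea needs a range check. With values up to 10^6, each square is at most 10^12, so a signed 64-bit sum of squares is safe for up to roughly 9 million samples, while a double holds such a sum exactly only while it stays below 2^53 (about 9 * 10^15). The sums of x^3 and x^4 needed for skewness and kurtosis grow much faster: a single x^4 term can reach 10^24, which already exceeds 64 bits.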

"While the textbook method can produce accurate results most of the time, a level of uncertainty remains that perhaps a particular trial pushes into an area where the textbook method is inaccurate." The research itself does not give details of its limitations! — Bassem, Mar 26 '18 at 07:55
@BassemAkl The research does not, but many other sources state that the equation above is the fastest and simplest, yet suffers from numerical instability. — zlon, Mar 26 '18 at 07:58

score 0 · Answer 1 · edited May 29 '18 at 02:50

To improve numerical stability, you can normalize the data. See Wikipedia: normalization.

For example, given a data set X_1, ..., X_n with mean x_bar and standard deviation s, normalize each value as (X_i - x_bar) / s.
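In a single pass the true x_bar and s are not known in advance, so the full normalization cannot be applied directly. A related technique, shown here as a sketch and not as what the answer itself describes, is to shift every value by a constant K close to the mean (the first observation is used below). The variance is unchanged by the shift, and the accumulated sums stay small. The function name `shifted_std` and the sample data are hypothetical.

```python
import math

def shifted_std(xs):
    """One-pass mean and sample std using data shifted by the first value."""
    it = iter(xs)
    k = next(it)  # shift constant K: an assumption, any value near the mean works
    n = 1
    d1 = 0.0  # running sum of (x - K)
    d2 = 0.0  # running sum of (x - K)^2
    for x in it:
        d = x - k
        n += 1
        d1 += d
        d2 += d * d
    mean = k + d1 / n
    # Same textbook formula, applied to the shifted values.
    var = (d2 - d1 * d1 / n) / (n - 1)
    return mean, math.sqrt(var)

mean, std = shifted_std([1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16])
print(mean, std)
```

Because the shifted values are small, the two terms being subtracted no longer share a long run of leading digits, which is where the unshifted formula loses precision. The sketch needs at least two values.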

answered May 29 '18 at 02:28 by Nguyễn Thu · edited May 29 '18 at 02:50 by Stephen Rauch
