Use case: streaming large amounts of event-sourced data that may contain inserts, updates, and deletes, with guaranteed ordering.
Assume Welford's algorithm in the following form, handling inserts from the event stream:
private double _count = 0;
private double _mean = 0;
private double _s = 0;

public void Insert(double value)
{
    var prev_mean = _mean;
    _count = _count + 1;
    if (_count == 1)
    {
        _mean = value;
        _s = 0;
    }
    else
    {
        _mean = _mean + (value - _mean) / _count;
        _s = _s + (value - _mean) * (value - prev_mean);
    }
}

public double Var => ((_count > 1) ? _s / (_count - 1) : 0.0);
public double StDev => Math.Sqrt(Var);
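For reference, the Insert step can be written as recurrences. The symbols here are chosen only for illustration and are not in the code: n is _count after the insert, x_n is value, M_n is _mean, and S_n is _s (the running sum of squared deviations from the mean), with M_1 = x_1 and S_1 = 0.

```latex
\begin{aligned}
M_n &= M_{n-1} + \frac{x_n - M_{n-1}}{n}, \\
S_n &= S_{n-1} + (x_n - M_n)\,(x_n - M_{n-1}), \\
\mathrm{Var}_n &= \frac{S_n}{n-1} \qquad (n > 1).
\end{aligned}
```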
Is it possible to adjust the online statistics when a known, previously inserted value is updated or deleted? Or is there a more appropriate approach than Welford's algorithm for this need?
public void Update(double previousValue, double value)
{
    // This value comes out correct.
    var prev_mean = (_count * _mean - value) / (_count - 1);
    // I inverted the _s recurrence, but this does not give the right values.
    var prev_s = -(previousValue * previousValue) + previousValue * prev_mean + _mean * previousValue - _mean * prev_mean + _s;
}
public void Delete(double previousValue)
{
    _count = _count - 1;
}
Edit
The specific questions are:
How can I calculate a correct value for _mean and _s in the case of an Update?
How can I calculate a correct value for _mean and _s in the case of a Delete?
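As a possible starting point, and not a tested implementation, solving the insert recurrences algebraically for the earlier state suggests the relations below. The notation is assumed for illustration: n is the current _count, M and S are the current _mean and _s, x is the value being deleted, and x_old and x_new are the value before and after an update. The delete form needs n > 1; for n = 1 the state would simply reset to zero. Because these steps subtract, rounding error may build up over many removals, and S could drift slightly below zero.

```latex
\begin{aligned}
\text{Delete } x:\quad
M_{n-1} &= \frac{n\,M_n - x}{n-1}, \\
S_{n-1} &= S_n - (x - M_n)\,(x - M_{n-1}), \\[6pt]
\text{Update } x_{\text{old}} \to x_{\text{new}}:\quad
M' &= M + \frac{x_{\text{new}} - x_{\text{old}}}{n}, \\
S' &= S + (x_{\text{new}} - x_{\text{old}})\,\bigl(x_{\text{new}} - M' + x_{\text{old}} - M\bigr).
\end{aligned}
```

The update form is the same as a delete of x_old followed by an insert of x_new, folded into one step so that n does not change.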