
My problem:

I have an array of ufloats (i.e. a uarray) from Python's uncertainties package. Every value in the array has its own error, and I need a function that gives me the average of the array, taking into account both the error I get when calculating the mean of the nominal values and the influence of the values' individual errors.

I have an uarray:

2 +/- 1, 3 +/- 2, 4 +/- 3

and need a function that gives me the average value of the array.

Thanks

DomR

4 Answers


Assuming Gaussian statistics, each uncertainty describes a Gaussian parent distribution around its measurement. In that case, it is standard to weight the measurements (nominal values) by the inverse variance. Plugging these weights into the general weighted average gives

$$ \bar{x} = \frac{\sum_i w_i x_i}{\sum_i w_i} = \frac{\sum_i x_i/\sigma_i^2}{\sum_i 1/\sigma_i^2}. $$

One need only perform good ol' error propagation on this to get an uncertainty for the weighted average; with $N$ measurements, the form used in the code below is

$$ \sigma_{\bar{x}} = \sqrt{\frac{N}{\sum_i 1/\sigma_i^2}}. $$

I don't have a general n-element version written out on hand, but here's how one could get the weighted average and its uncertainty in a simple two-value case (a sketch for arbitrary-length arrays follows below):

    import numpy as np
    import uncertainties as un

    # Two measurements with different uncertainties
    a = un.ufloat(5, 2)
    b = un.ufloat(8, 4)
    # Inverse-variance weighted nominal value, with sqrt(N / sum(1/sigma^2)) as its uncertainty (N = 2 here)
    wavg = un.ufloat((a.n/a.s**2 + b.n/b.s**2)/(1/a.s**2 + 1/b.s**2),
                     np.sqrt(2/(1/a.s**2 + 1/b.s**2)))
    print(wavg)
    >>> 5.6+/-2.5298221281347035

As one would expect, the result tends more toward the value with the smaller uncertainty. This makes sense, since a smaller uncertainty in a measurement implies that its nominal value lies closer to the true value of the parent distribution than those with larger uncertainties.
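For an arbitrary-length uarray, the same weighting can be wrapped in a small helper. This is only a sketch: the name weighted_mean is mine, and it keeps the same N / sum(1/sigma^2) uncertainty as the two-value example above:

    import numpy as np
    import uncertainties as un
    from uncertainties import unumpy

    def weighted_mean(uarr):
        # Inverse-variance weighted mean, with the same sqrt(N / sum(1/sigma^2))
        # uncertainty as the two-value example above
        noms = unumpy.nominal_values(uarr)
        sigmas = unumpy.std_devs(uarr)
        weights = 1.0 / sigmas**2
        nominal = np.sum(weights * noms) / np.sum(weights)
        sigma = np.sqrt(len(uarr) / np.sum(weights))
        return un.ufloat(nominal, sigma)

    values = unumpy.uarray([5, 8], [2, 4])
    print(weighted_mean(values))
    # 5.6+/-2.5, matching the example above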

Captain Morgan

Unless I'm missing something, you could calculate the sum divided by the length of the array:

from uncertainties import unumpy, ufloat
import numpy as np
arr = np.array([ufloat(2, 1), ufloat(3, 2), ufloat(4, 3)])
print(sum(arr)/len(arr))
# 3.0+/-1.2

You can also define it like this:

arr1 = unumpy.uarray([2, 3, 4], [1, 2, 3])
print(sum(arr1)/len(arr1))
# 3.0+/-1.2

uncertainties takes care of the rest.
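If you'd rather not spell out the sum, calling .mean() on the object array should give the same result, since NumPy simply sums the elements and divides by their number (a quick sketch reusing the uarray from above):

from uncertainties import unumpy

arr1 = unumpy.uarray([2, 3, 4], [1, 2, 3])
print(arr1.mean())
# 3.0+/-1.2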

Eric Duminil
  • I doubt that's it; if I use this on my real data, I get an error value of +/- 0.4 while the standard error of the mean of the nominal values is around 8. – DomR Apr 26 '17 at 15:13
  • You might have a different error distribution. This [article](https://newton.cx/~peter/2013/04/propagating-uncertainties-the-lazy-and-absurd-way/) might interest you. – Eric Duminil Apr 26 '17 at 15:20
  • The problem with this is you're getting the nominal value and uncertainty of the simple sum divided by the length of entries. See my answer (coming up). – Captain Morgan Sep 15 '20 at 17:53

I used Captain Morgan's answer to serve up some sweet Python code for a project and discovered that it needed a little extra ingredient:

    import numpy as np
    from uncertainties import ufloat, unumpy as unp

    # 'values' is an iterable of ufloats; epsilon guards against zero uncertainties
    epsilon = unp.nominal_values(values).mean()/(1e12)
    wavg = ufloat(sum([v.n/(v.s**2+epsilon) for v in values])/sum([1/(v.s**2+epsilon) for v in values]),
                  np.sqrt(len(values)/sum([1/(v.s**2+epsilon) for v in values])))
    # Report the result as exact if its uncertainty sits at the epsilon noise floor
    if wavg.s <= np.sqrt(epsilon):
        wavg = ufloat(wavg.n, 0.0)

Without that little something (epsilon) we'd get div/0 errors from observations recorded with zero uncertainty.
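Wrapped into a function for convenience (the helper name robust_wavg and the sample values below are only illustrative), it can be used like this:

    import numpy as np
    from uncertainties import ufloat, unumpy as unp

    def robust_wavg(values):
        # Same weighting as the snippet above, with the epsilon guard
        # against zero-uncertainty observations
        epsilon = unp.nominal_values(values).mean()/(1e12)
        weights = [1/(v.s**2 + epsilon) for v in values]
        nominal = sum(w*v.n for w, v in zip(weights, values))/sum(weights)
        sigma = np.sqrt(len(values)/sum(weights))
        wavg = ufloat(nominal, sigma)
        if wavg.s <= np.sqrt(epsilon):
            wavg = ufloat(wavg.n, 0.0)
        return wavg

    # An observation recorded with zero uncertainty no longer breaks the average
    print(robust_wavg([ufloat(5, 2), ufloat(8, 4), ufloat(6, 0)]))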

Michael Tiemann

If you already have a .csv file that stores the values as 'mean+/-std' strings, you could try the code below; it works for me.

import pandas as pd
from uncertainties import ufloat_fromstr

df = pd.read_csv(r'Z:\compare\SL2P_PAR.csv')
# Parse each 'mean+/-std' string into its nominal value and standard deviation
df['mean'] = df['uncertainty'].apply(lambda u: ufloat_fromstr(u).n)
df['std'] = df['uncertainty'].apply(lambda u: ufloat_fromstr(u).s)
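To check the parsing without the CSV file, here is a quick illustration with the values from the question (the in-memory DataFrame is just a stand-in for the file):

import pandas as pd
from uncertainties import ufloat_fromstr

df = pd.DataFrame({'uncertainty': ['2+/-1', '3+/-2', '4+/-3']})
parsed = df['uncertainty'].apply(ufloat_fromstr)
print(parsed.tolist())          # [2.0+/-1.0, 3.0+/-2.0, 4.0+/-3.0]
print(sum(parsed)/len(parsed))  # 3.0+/-1.2, the plain mean from the answers above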