0

I wanted to calculated 1, 2, 3 sigma error of a distribution using python. It is described in following 68–95–99.7 rule wikipedia page. So far I have written following code. Is it correct way to compute such kpi's. Thanks.

import numpy as np

# sensor and reference value
temperature_measured = np.random.rand(1000) # value from a sensor under test
temperature_reference = np.random.rand(1000) # value from a best sensor from market

# error computation
error = temperature_reference - temperature_measured
error_sigma = np.std(error)
error_mean = np.mean(error)

# kpi comutation
expected_sigma = 1 # 1 degree deviation is allowed (Customer requirement)
kpi_1_sigma = (abs(error - error_mean) < 1*expected_sigma).mean()*100.0 >= 68.27
kpi_2_sigma = (abs(error - error_mean) < 2*expected_sigma).mean()*100.0 >= 95.45
kpi_3_sigma = (abs(error - error_mean) < 3*expected_sigma).mean()*100.0 >= 99.73
BigBen
  • 46,229
  • 7
  • 24
  • 40
Oli
  • 1,313
  • 14
  • 31
  • 2
    Remember that the rule you cite assumes you have a _normal_ distribution, which surely is not true when you generate random numbers, although this won't be an issue once you have real data. – gimix Aug 11 '21 at 13:51

2 Answers2

2

I would recommend to use the definition you found in wikipedia and just calculate the percentiles, i.e., calculate the diference between:

((mu+sigma)-(mu-sigma) )/2.

sigma1 = (np.percentile(error, 50+34.1, axis=0)- np.percentile(error, 50-34.1, axis=0))/2.
sigma2 = (np.percentile(error, 50+34.1+13.6, axis=0)- np.percentile(error, 50-34.1-13.6, axis=0))/2.
sigma3 = (np.percentile(error, 50+34.1+13.6+2.1, axis=0)- np.percentile(error, 50-34.1-13.6-2.1, axis=0))/2.
  • the value of your `sigma1` is different from `np.std(error)`. Why it is like this? `np.std` is `np.sqrt(np.mean(np.abs(error - error.mean())**2))` – Oli Aug 12 '21 at 08:10
  • because your error is not a normal distribution. If you define `error` as a normal distribution with sigma=1 and many elements, both `std` and `sigma1` are the same. – Carlos Diaz Aug 23 '21 at 18:16
1

An easier way could be like so (taken from here):

NumPy's std yields the standard deviation, which is usually denoted with "sigma". To get the 2-sigma or 3-sigma ranges, you can simply multiply sigma with 2 or 3:

print [x.mean() - 3 * x.std(), x.mean() + 3 * x.std()]

result:

[-27.545797458510656, 52.315028227741429]

YoniChechik
  • 1,397
  • 16
  • 25