17

To perform a fit, I am currently using the curve_fit from scipy.optimize.

I have calculated the error associated with each of my ydata and I would like to add the calculated sigma = y_errors present in the data to the fit,

i.e. minimising sum( ((f(xdata, *popt) - ydata) / sigma)**2 ) instead of just sum( (f(xdata, *popt) - ydata)).

I can see that it can be done assigning the parameter sigma in the documentation. What I don't clearly understand is the absolute_sigma parameter. The explanation given in the documentation is quite confusing to me.

Should I set absolute_sigma= = True ? Or should it be set to False if I need to consider the y_errors associated with each of my ydata?

Srivatsan
  • 9,225
  • 13
  • 58
  • 83

1 Answers1

9

If you have absolute uncertainties in your data, i.e. if the units of your y_errors are the same as units of your ydata, then you should set absolute_sigma= = True. However it is often the case that the units of y_errors aren't known precisely, only the relative magnitude is known. An example of the latter case could be where some of the y values come from repeated measurements at the same x value. Then it would make sense to weight the repeated y values twice as much as the non-repeated y values, but the units of this weight (2) are not the same as whatever the units of y are.

Here's some code to illustrate the difference:

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# defining a model
def model(x, a, b):
    return a * np.exp(-b * x)

# defining the x vector and the real value of some parameters
x_vector = np.arange(100)
a_real, b_real = 1, 0.05

# some toy data with multiplicative uncertainty
y_vector = model(x_vector, a_real, b_real) * (1 + norm.rvs(scale=0.08, size=100))

# fit the parameters, equal weighting on all data points
params, cov = curve_fit(model, x_vector, y_vector )
print params
print cov


# fit the parameters, weighting each data point by its inverse value
params, cov = curve_fit(model, x_vector, y_vector, 
                        sigma=1/y_vector, absolute_sigma=False)
print params
print cov

# with absolute_sigma=False:
## multiplicative transformations of y_data don't matter
params, cov = curve_fit(model, x_vector, y_vector, 
                        sigma=100/y_vector, absolute_sigma=False)
print params
print cov

# but absolute_sigma=True:
## multiplicative transformations of sigma carry through to pcov
params, cov = curve_fit(model, x_vector, y_vector,
                        sigma=100/y_vector, absolute_sigma=True)
print params
print cov

[ 1.03190409  0.05093425]
[[  1.15344847e-03   5.70001955e-05]
 [  5.70001955e-05   5.92595318e-06]]

[ 1.0134898   0.04872328]
[[  1.57940876e-04   1.56490218e-05]
 [  1.56490218e-05   3.56159680e-06]]

[ 1.0134898   0.04872328]
[[  1.57940878e-04   1.56490220e-05]
 [  1.56490220e-05   3.56159682e-06]]

[ 1.0134898   0.04872328]
[[ 2978.10865352   295.07552766]
 [  295.07552766    67.15691613]]
Curt F.
  • 4,690
  • 2
  • 22
  • 39
  • The way I get my yerrors are using jackknife resampling method. So I calculate the same quantity, but each time omitting a subsample from my data. So I believe in my case I know that the yerrors have the same units as my data. Is that correct? – Srivatsan Jul 29 '15 at 16:43
  • I'm not 100% sure...its tough to say without a detailed description of your data and algorithms...apologies. – Curt F. Jul 30 '15 at 00:58