
Hello, I am trying to learn how to properly use lmfit and I think I am calculating the fit errors wrong. I have some data with errors on y, and when I do the fit I call this (I tried it for a simple linear fit):

weight = 1/err  # weight each point by 1/(y uncertainty)
out = line_fit.fit(y, pars, x=x, weights=weight)

I assumed that this would calculate the chi-square with the errors in the denominator. However, it does not seem to work properly. The fit looks good and I am getting reasonable values, but if I increase the errors on purpose, e.g. err = 50*err, I get exactly the same fit parameters. Obviously the errors on the parameters should now be much bigger (by the propagation-of-error formula), but they are exactly the same. What am I doing wrong?
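For reference, here is a minimal version of my setup (with synthetic data standing in for my real measurements) that reproduces this:

import numpy as np
from lmfit.models import LinearModel

np.random.seed(0)
x = np.linspace(0, 10, 50)
err = 0.5 * np.ones_like(x)                   # known y uncertainties
y = 2.0 * x + 1.0 + np.random.normal(0.0, err)

line_fit = LinearModel()
pars = line_fit.guess(y, x=x)

out = line_fit.fit(y, pars, x=x, weights=1/err)
out50 = line_fit.fit(y, pars, x=x, weights=1/(50*err))

# the reported stderr values are identical in both fits
print(out.params['slope'].stderr, out50.params['slope'].stderr)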

A second question is: if I have errors on the x axis, how can I include them in the fit? There is just one weights parameter in the function call.

Thank you!

JohnDoe122

1 Answer


It is a deliberate feature of lmfit.Model (and lmfit in general) to rescale the uncertainties so that they reflect a "good fit". That is, instead of reporting parameter uncertainties that increase chi-square by 1, it reports parameter uncertainties that increase chi-square by the reduced chi-square. This means that changing the scale of the uncertainties or fit weights will change the value of the fit statistics but not the reported uncertainties on the parameter values.

If you do want the parameter uncertainties to be those that increase chi-square by 1, use Model.fit(ydata, params, ..., scale_covar=False).
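For example, with the same kind of linear fit as in your snippet (a minimal sketch; the variable names follow the question):

# default (scale_covar=True): inflating err leaves the stderr unchanged
out_a = line_fit.fit(y, pars, x=x, weights=1/err)
out_b = line_fit.fit(y, pars, x=x, weights=1/(50*err))
print(out_a.params['slope'].stderr, out_b.params['slope'].stderr)   # same

# scale_covar=False: the stderr now scales with the supplied uncertainties
out_c = line_fit.fit(y, pars, x=x, weights=1/err, scale_covar=False)
out_d = line_fit.fit(y, pars, x=x, weights=1/(50*err), scale_covar=False)
print(out_c.params['slope'].stderr, out_d.params['slope'].stderr)   # ~50x larger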

Uncertainties in x or any independent data are challenging to include in any automated way. The fit does not actually use the values of the independent data except to inform the model function how to calculate the model (the y values) that is compared to the provided data. You might consider:

  • increasing the uncertainty of each y value based on the change in that y value that would happen in response to the uncertainty in x. There is not an automated way to do this with the Model framework -- you would need to do it yourself, for example within your model function or by adjusting the weights (see the first sketch after this list).

  • looking into scipy.odr, which can handle uncertainties in both the dependent and independent data. This is not supported in lmfit, but you might find it useful (see the second sketch after this list).
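For the first option, a rough sketch of the idea (plain error propagation, not a built-in lmfit feature; xerr stands for your array of x uncertainties and the other names follow the question):

import numpy as np

# effective y uncertainty: sigma_eff**2 = sigma_y**2 + (dy/dx * sigma_x)**2
slope_est = np.gradient(y, x)                 # rough local dy/dx from the data
eff_err = np.sqrt(err**2 + (slope_est * xerr)**2)

out = line_fit.fit(y, pars, x=x, weights=1/eff_err)

For the second option, a minimal scipy.odr sketch for a straight line (again, xerr and err are the x and y uncertainties):

import numpy as np
from scipy import odr

def linear(beta, x):
    # beta = [slope, intercept]
    return beta[0] * x + beta[1]

model = odr.Model(linear)
data = odr.RealData(x, y, sx=xerr, sy=err)    # sx, sy: x and y uncertainties
fit = odr.ODR(data, model, beta0=[1.0, 0.0])
result = fit.run()
print(result.beta)      # best-fit [slope, intercept]
print(result.sd_beta)   # their standard errors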

M Newville
  • Thank you so much for your reply, this is really helpful. I am quite new to statistics, so this might be a silly question, but when should I use scale_covar=False? In other words, what values should I quote for my errors? I read a few experimental papers and didn't really see any mention of this kind of choice, so is there a generally accepted way? – JohnDoe122 Dec 12 '19 at 03:54
  • Use `scale_covar=False` when you are more confident that your error bars are the correct size than you are that the fit is "good". That is, one can use reduced chi-square to assess whether a fit is good (it should be ~1 for a good fit), but that depends on the scale of the uncertainties. Using `scale_covar=True` effectively says "adjust the data uncertainties so that reduced chi-square would be 1, and report those parameter uncertainties". That is, if you're bothering to report parameter uncertainties, you probably believe the fit is good. – M Newville Dec 12 '19 at 22:31
  • Thank you again! I understand now what each of them means. I am still not 100% sure I would be able to figure out in practice which one to use. For example, in my case I need to fit a straight line to some counting data. The errors are Poisson (so I am pretty sure about them) and the fit looks good. If I use the default scale_covar=True I get an error (25 +/- 1) twice as big as when I use scale_covar=False (25 +/- 0.5). Based on your explanation, given that I am confident about my fit, I should use the one with +/- 1. Would this be the right thing to do? – JohnDoe122 Dec 14 '19 at 02:08
  • One more thing: in scientific publications, I don't think I've ever seen anyone state which of these two ways of computing the errors on a fit they use. Based on "if you're bothering to report parameter uncertainties you probably believe the fit is good", I assume they use something equivalent to scale_covar=True. Does this make sense? Thank you! – JohnDoe122 Dec 14 '19 at 02:09
  • @JohnDoe122 Which approach you take is mostly up to you. Of course, you'd want to be consistent. The `scale_covar` approach is fairly common in some disciplines, but you might want to check with others in your field about it. FWIW, I think most people really only believe the first significant digit (or even "order of magnitude") of an error bar. If you say "25 +/- 1", I would not think 27 is impossible. For sure, some fields in high-energy physics are much more rigorous about error budgets, but if you're asking on Stack Overflow, that probably does not apply to you ;) – M Newville Dec 14 '19 at 03:20