
When I looked into how lmfit calculates parameter errors when we fit functions, I found that the covariance matrix is computed as 2 times the inverse of the Hessian matrix. The error on each parameter is then taken as the square root of the corresponding diagonal element of the covariance matrix (after some scaling by the reduced chi-square). Where does this factor of two come from? Line 763 of this source is what I am referring to. I looked for an explanation but have not been able to find anything about it online.
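
Schematically, my understanding of the calculation is something like the sketch below (plain NumPy with made-up placeholder values, not the actual lmfit source):

```python
import numpy as np

# Placeholder values standing in for what lmfit computes at the best fit
hess = np.array([[40.0, 2.0],
                 [2.0, 10.0]])   # Hessian of the scalar objective
redchi = 1.3                     # reduced chi-square

cov = 2.0 * np.linalg.inv(hess)  # <-- the factor of two I am asking about
cov = cov * redchi               # scaling by reduced chi-square
stderr = np.sqrt(np.diag(cov))   # reported 1-sigma parameter uncertainties
print(stderr)
```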

Thank you for your help!

Sara S

1 Answer


This is because that function is trying to calculate uncertainties for minimization objective functions that return the sum of squares of residuals instead of an array of residuals (of ndata points). The Jacobian (as used with least_squares at https://github.com/lmfit/lmfit-py/blob/d2ba8e9aaefa99cf144c65aaaf13671989f2d681/lmfit/minimizer.py#L1620 or as done internally with leastsq) cannot be used because those ndata points have been summed into a single scalar value. So the Hessian is calculated directly, but it is the Hessian of residual**2, not of residual. That squaring is the origin of the factor of 2.
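
To see where the 2 comes from concretely, here is a small self-contained sketch (plain NumPy with a made-up linear model, not lmfit code): for chi-square = sum of squared residuals, the Hessian of that scalar is 2 * J^T J, so (J^T J)^-1 = 2 * inv(Hessian).

```python
import numpy as np

# Made-up linear model: residuals r_i = data_i - (a*x_i + b)
rng = np.random.default_rng(1)
x = np.linspace(0, 5, 50)
data = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

def chi2(p):
    a, b = p
    return np.sum((data - (a * x + b)) ** 2)

p0 = np.array([2.0, 1.0])

# Jacobian of the residual vector: d r_i / d a = -x_i, d r_i / d b = -1
J = np.column_stack([-x, -np.ones_like(x)])

# Numerical Hessian of the scalar chi2 via central finite differences
eps = 1e-4
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        pp = p0.copy(); pp[i] += eps; pp[j] += eps
        pm = p0.copy(); pm[i] += eps; pm[j] -= eps
        mp = p0.copy(); mp[i] -= eps; mp[j] += eps
        mm = p0.copy(); mm[i] -= eps; mm[j] -= eps
        H[i, j] = (chi2(pp) - chi2(pm) - chi2(mp) + chi2(mm)) / (4 * eps ** 2)

print(H)            # Hessian of the summed chi-square
print(2 * J.T @ J)  # matches H: the Hessian carries an extra factor of 2
# so cov = inv(J^T J) = 2 * inv(H), which is the 2 in the code you reference
```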

Another way to look at it is that the factor of 2 makes the estimated uncertainties match those from the covariance matrix of the leastsq (and least_squares) methods. If it were not included, the estimated uncertainties would be systematically too small by a factor of about sqrt(2).
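
As a rough illustration of that consistency (a made-up exponential-decay fit; assumes lmfit is installed and, for the scalar-minimizer path, the optional numdifftools package), one can compare the uncertainties reported by `leastsq` with those from a scalar method such as Nelder-Mead:

```python
import numpy as np
import lmfit

# Synthetic exponential-decay data (made-up example)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 201)
data = 3.0 * np.exp(-x / 2.0) + rng.normal(scale=0.05, size=x.size)

def resid(params, x, data):
    # Residual array; scalar methods reduce it to a sum of squares internally
    v = params.valuesdict()
    return data - v['amp'] * np.exp(-x / v['tau'])

params = lmfit.Parameters()
params.add('amp', value=1.0)
params.add('tau', value=1.0, min=1e-6)

out_lsq = lmfit.minimize(resid, params, args=(x, data), method='leastsq')
out_nm = lmfit.minimize(resid, params, args=(x, data), method='nelder')

for name in ('amp', 'tau'):
    print(name,
          out_lsq.params[name].stderr,   # from the leastsq covariance
          out_nm.params[name].stderr)    # from 2 * inv(Hessian); needs numdifftools
```

With the factor of 2 in place, the two sets of stderr values should agree closely; without it, the scalar-method values would come out smaller by about sqrt(2).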

M Newville
  • Thank you very much for the clarification. From what I understand, this factor of two has to be compensated for somehow to compute the right error when we use our own loss function, then? If we don't, the reported error will be the true error multiplied by sqrt(2). – Sara S Mar 13 '23 at 21:10
  • I think it *might* need to be modified when supplying your own loss function. Many solvers expect the objective to return a "residual array" to be minimized in the least-squares sense. Objective functions used with the scalar solvers (which use the code you reference) are assumed to return the "sum of squares of residuals" (the "square" essentially giving the 2). If you are using such a scalar solver but returning the "sum of absolute values of residuals", then I do think the uncertainties would be overestimated by sqrt(2). If you are using a more exotic loss function, I think it will depend on the details of that loss function. ;) – M Newville Mar 13 '23 at 21:30
  • Great. Thank you! I wanted to use the Poisson loss function, which is the sum of (model - data*log(model) + const). So I believe I need to rescale the reported error by 1/sqrt(2). – Sara S Mar 13 '23 at 22:35
  • Yeah, or maybe even better (because solvers that get the un-summed residual array are faster and often more robust) use the `leastsq` method and just weight your residual by `log(model)`. That is, minimize `(data-model)*log(model)` in the least-squares sense. – M Newville Mar 14 '23 at 01:42
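
For reference, here is a rough sketch of the kind of scalar Poisson-style objective discussed in the comments above (hypothetical count data and model; whether the reported stderr values then need rescaling, e.g. by 1/sqrt(2), follows from the discussion above rather than being handled automatically):

```python
import numpy as np
import lmfit

# Hypothetical Poisson count data from an exponential-decay model
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 101)
data = rng.poisson(50.0 * np.exp(-x / 3.0))

def poisson_nll(params, x, data):
    v = params.valuesdict()
    model = v['amp'] * np.exp(-x / v['tau'])
    # Scalar objective: sum of (model - data*log(model)), dropping constants
    return np.sum(model - data * np.log(model))

params = lmfit.Parameters()
params.add('amp', value=30.0, min=1e-6)
params.add('tau', value=1.0, min=1e-6)

out = lmfit.minimize(poisson_nll, params, args=(x, data), method='nelder')
for name in ('amp', 'tau'):
    print(name, out.params[name].value, out.params[name].stderr)
```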