I am using Python 3.6 for data fitting. Recently I came across the following problem, and since I lack experience here, I am not sure how to deal with it.
If I use numpy.polyfit(x, y, 1, cov=True) and scipy.optimize.curve_fit(lambda x, a, b: a*x + b, x, y) on the same set of data points, I get nearly the same coefficients a and b. But the entries of the covariance matrix returned by scipy.optimize.curve_fit are roughly half of those from numpy.polyfit.
Since I want to use the diagonal of the covariance matrix to estimate the uncertainties of the coefficients (u = numpy.sqrt(numpy.diag(cov))), I have three questions:
- Which covariance matrix is the right one (which one should I use)?
- Why is there a difference?
- What would it take to make them equal?
Thanks!
Edit: Here is a minimal example:
import numpy as np
import scipy.optimize as sc

data = np.array([[1, 2, 3, 4, 5, 6, 7],
                 [1.1, 1.9, 3.2, 4.3, 4.8, 6.0, 7.3]]).T
x = data[:, 0]
y = data[:, 1]

# Linear fit with numpy.polyfit; cov=True also returns the covariance matrix.
coeffs, cov_polyfit = np.polyfit(x, y, 1, cov=True)
print('Polyfit:', np.diag(cov_polyfit))

# The same linear model fitted with scipy.optimize.curve_fit.
popt, cov_curvefit = sc.curve_fit(lambda x, a, b: a*x + b, x, y)
print('Curve_Fit:', np.diag(cov_curvefit))
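The uncertainty estimate mentioned above would then be computed like this (using the variable names from the snippet):

u_polyfit = np.sqrt(np.diag(cov_polyfit))      # uncertainties of [a, b] from polyfit
u_curvefit = np.sqrt(np.diag(cov_curvefit))    # uncertainties of [a, b] from curve_fit
print('Uncertainties:', u_polyfit, u_curvefit)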
If I use statsmodels.api, the result corresponds to that of curve_fit.
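For reference, the statsmodels comparison I mean looks roughly like this (a minimal sketch using statsmodels' standard OLS API, with x and y from the snippet above; note that statsmodels orders the parameters as [intercept, slope]):

import statsmodels.api as sm

X = sm.add_constant(x)            # design matrix with an added intercept column
ols = sm.OLS(y, X).fit()          # ordinary least squares fit of y = b + a*x
print('OLS:', np.diag(ols.cov_params()))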