
I took the example from the official documentation page of scipy.optimize.curve_fit (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html) and modified the function in the example a little bit, and scipy throws the warning "Covariance of the parameters could not be estimated" and gives me a bad fit. In my opinion, fitting such an unremarkable function should work fine, so either curve_fit is working badly or I am missing something and started off wrong. Could someone give me a hint what the problem is, or which library I could use instead?

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a - b*np.exp(-c*x)

xdata = np.linspace(0, 4, 50)
y = func(xdata, 823.5, 5.3, 8.2)
rng = np.random.default_rng()
y_noise = 0.2 * rng.normal(size=xdata.size)
ydata = y + y_noise
plt.plot(xdata, ydata, 'b-', label='data')

popt, pcov = curve_fit(func, xdata, ydata)
plt.plot(xdata, func(xdata, *popt), 'r-', label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))

plt.plot(xdata, y, 'g--', label='Original')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
  • As always... it's the starting values. `curve_fit` assumes, if not given otherwise, values of order `1`. Now `a` is obviously off, but the `chi^2` can also be decreased by increasing `c`. This is exactly what the algorithm does. Only in the end (I guess) does `a` go up to improve things further (probably happening somewhat in parallel). Nevertheless, at that point the solution is so far from the true global minimum that "convergence" stops at a somewhat ill-defined position in parameter space; probably not a parabolic minimum. Give reasonably good starting values and it works. – mikuszefski Aug 24 '21 at 11:40
  • Hi Mikus, thank you for your reply. I tried `a + 820` instead of `a`, and also `xdata = np.linspace(0.1, 4, 50)` instead of `xdata = np.linspace(0, 4, 50)`. I think I did not understand your point. What do you mean by a starting point and the order of a value? The code above is complete and reproduces the issue. Could you please fix the code above to show what you mean? – Jonny Joker Aug 25 '21 at 09:27
  • The solution is `popt, pcov = curve_fit(func, xdata, ydata, p0=(800, 10, 1))`. Note, a linear fit is just the solution to a set of linear equations. For non-linear fits one may do this iteratively by linearizing the equation at some set of parameters and continuing with the solution. This needs, however, some point to start with. Non-linear equations can have local minima in the chi^2 surface, though, and the gradient at your starting point may point in that direction. Or, more like your case, the gradient points away from the global minimum into a direction where no true minimum exists. – mikuszefski Aug 25 '21 at 11:04
  • Note that your equation can be fitted in a linear way using the techniques presented here: https://de.scribd.com/doc/14674814/Regressions-et-equations-integrales You don't need much French to get the point. – mikuszefski Aug 25 '21 at 11:05
  • Concerning my point above, consider the following 1D example. Say you want to find the minimum of `1 / ( x^2 + 1 ) - 2 / ( x^2 + 1 )^2` by following the negative gradient (Newton method). If your search starts, e.g., at `x0 = 1.8`, you will be led in the positive x-direction forever without finding the global minimum at `x = 0`. – mikuszefski Aug 25 '21 at 11:42
  • Hi Mikus, thank you for solving the issue and bringing some light into the darkness. This [link](https://scikit-guess.readthedocs.io/en/sine/_downloads/4b4ed1e691ff195be3ca73879a674234/Regressions-et-equations-integrales.pdf) made it easier for me to access the paper. – Jonny Joker Aug 25 '21 at 12:53
  • In your solution, one has to give curve_fit a set of starting values, which is somehow a good approximation to the wanted solution. For this artificial example it is of course easy to find. In a real-world problem one often doesn't know good starting values (`a=a_0, b=b_0, c=c_0`), so why is curve_fit not able to find them by itself from the data? Or how would you get the starting point from the data numerically, without using pen and paper? – Jonny Joker Aug 25 '21 at 12:53
  • One thing that remains is: computers are dumb. The possibilities with non-linear functions are so vast that you cannot expect one rather simple algorithm to handle all cases. There are more sophisticated algorithms for finding a global minimum, like differential evolution (which actually has a Python implementation), but in the end, I'd say, you should always try to apply as much knowledge as you have about the problem and data before fitting. And yes, in a lot of cases that actually means using pen and paper. – mikuszefski Aug 25 '21 at 13:02
  • Thank you very much for your help. So I will keep my pen and save some sheets of white paper ;-). – Jonny Joker Aug 25 '21 at 13:11
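The comments above mention scipy's differential evolution as a way to find starting values without pen and paper. A minimal sketch under assumed parameter bounds (the bounds and seeds below are my choices, not from the original thread): minimize the sum of squared residuals globally, then hand the result to `curve_fit` as `p0`.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution

def func(x, a, b, c):
    return a - b * np.exp(-c * x)

# Recreate the noisy data from the question (seeded for reproducibility)
rng = np.random.default_rng(0)
xdata = np.linspace(0, 4, 50)
ydata = func(xdata, 823.5, 5.3, 8.2) + 0.2 * rng.normal(size=xdata.size)

# Sum-of-squares objective over the parameter vector
def sse(params):
    return np.sum((ydata - func(xdata, *params)) ** 2)

# differential_evolution needs finite bounds; these are assumed generous
bounds = [(0, 2000), (-100, 100), (0, 50)]
result = differential_evolution(sse, bounds, seed=1)

# Refine the global estimate with the local least-squares fit
popt, pcov = curve_fit(func, xdata, ydata, p0=result.x)
print(popt)
```

The trade-off is cost: differential evolution evaluates the objective many times, so for cheap models it is a convenient way to automate the choice of `p0`, while for expensive models a pen-and-paper estimate (e.g. `a ≈ max(ydata)`) is still faster.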

1 Answer


The question was solved by mikuszefski (see the comments). The solution is `popt, pcov = curve_fit(func, xdata, ydata, p0=(800, 10, 1))`. Here, `p0` is a rough approximation to the wanted solution, which has to be found yourself before fitting.
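For completeness, a self-contained sketch of the fix (the random seed is my addition for reproducibility; the `p0` values come from the comments):

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    return a - b * np.exp(-c * x)

rng = np.random.default_rng(0)
xdata = np.linspace(0, 4, 50)
ydata = func(xdata, 823.5, 5.3, 8.2) + 0.2 * rng.normal(size=xdata.size)

# Without p0 curve_fit starts from (1, 1, 1), which is far off for a;
# a rough guess near the data scale is enough to reach the global minimum.
popt, pcov = curve_fit(func, xdata, ydata, p0=(800, 10, 1))
print(popt)
```

Note that a usable guess can often be read straight off the data, e.g. `a ≈ max(ydata)` for the plateau of this exponential saturation curve.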