
I am fitting a curve using scipy.optimize.curve_fit. From what I can tell, the fitting is performed by minimizing the sum of the squared residuals of f(xdata, *popt) - ydata, whereas I want to minimize the squared residuals of the relative error, (f(xdata, *popt) - ydata)/ydata, since the order of magnitude of my ydata varies a lot. How can I optimize using the relative deviation? I do not necessarily need to use curve_fit; any Python function that achieves this is fine.

PS: I am aware of the alternative approach of converting the ydata into log space and fitting the resulting data, but I do not want to take that approach.
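For concreteness, this is the objective I would like to minimize; a minimal sketch, where `f` stands in for a generic model function:

```python
import numpy as np

def objective(params, f, xdata, ydata):
    # sum of squared *relative* residuals, instead of curve_fit's
    # default sum of squared absolute residuals
    return np.sum(((f(xdata, *params) - ydata) / ydata) ** 2)
```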

SKPS

2 Answers


I suppose that this is related to the previous question, scipy curve_fit coefficient does not align with expected value (physics relevant?).

Instead of importing ydata, import a new numerical file made of log(ydata).

And replace the function f(xdata) by a new function log(f(xdata)).

This is equivalent to changing the fitting criterion from LMSE (least mean squared error) to LMSRE (least mean squared relative error).
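A minimal sketch of the idea; the exponential model and the synthetic data here are only placeholders for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):
    # placeholder model; substitute your own
    return a * np.exp(b * x)

def log_f(x, a, b):
    # fit log(f(xdata)) against log(ydata) instead of f against ydata
    return np.log(f(x, a, b))

# synthetic data spanning several orders of magnitude
rng = np.random.default_rng(0)
xdata = np.linspace(0, 5, 50)
ydata = f(xdata, 2.0, 1.5) * (1 + 0.05 * rng.normal(size=50))

popt, pcov = curve_fit(log_f, xdata, np.log(ydata), p0=[1.0, 1.0])
print(popt)
```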

JJacquelin
  • Converting the data to log is not equivalent to doing LMSRE. It basically brings the data range closer together. As I said in the question, I'm not looking to convert the data to log space. I really wonder whether this relative-deviation approach is already built into these functions. – SKPS Mar 29 '21 at 07:22
  • Try it and see. – JJacquelin Mar 29 '21 at 07:24
  • What you suggested would definitely improve the fit. But what I asked is different. – SKPS Mar 29 '21 at 07:27
  • @SKPS Well, this is what log-space does: relative error. If you have at point `x` the theoretical value `a` and measured `a + sa`, the difference for the least squares is `( a + sa ) - a = sa`. In log-space you have `log( a + sa ) - log( a ) = log( 1 + sa/a )`. If you do not like this, you may use `least_squares` instead and weight the data with the inverse theoretical value, as done [here](https://stackoverflow.com/a/66689232/803359) – mikuszefski Mar 29 '21 at 09:01
  • @mikuszefski : You are right. Weighting with 1/y is also an equivalent method. Pertinent comment (+1). – JJacquelin Mar 29 '21 at 10:52
  • @mikuszefski Thanks a lot for taking the time to explain things. If you post this as an answer, I will accept it. Thanks! – SKPS Mar 29 '21 at 14:46
  • @JJacquelin Sorry, I did not see this connection earlier. Now it looks clear. – SKPS Mar 29 '21 at 14:47

An error weighting with the inverse y data is (approximately) equivalent to a transformation into log-space, for the following reason: in a standard least-squares fit, the data y deviates from the "true" function value yt by the error sa. The least-squares fit minimizes the sum of the squared residuals (y - yt)**2 = sa**2 over all data. When going to log-space, the residual reads log(y) - log(yt) = log( yt + sa ) - log( yt ) = log( 1 + sa / yt ) = log( y / yt ), using the standard rules for logarithms.

For small errors, log( 1 + sa / yt ) ≈ sa / yt, which is hence the relative error, as also pointed out by JJacquelin.

BTW, since least squares squares the residual, this is of course also ( log( y ) - log( yt ) )**2 = ( log( yt ) - log( y ) )**2 = log( yt / y )**2
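As an alternative to the log transform, the inverse weighting can also be done directly. `curve_fit` accepts a `sigma` argument that divides each residual, so passing `sigma=ydata` minimizes the squared relative error exactly; the same objective can be spelled out with `least_squares`. A minimal sketch, with the model and data as placeholders:

```python
import numpy as np
from scipy.optimize import curve_fit, least_squares

def f(x, a, b):
    # placeholder model; substitute your own
    return a * np.exp(b * x)

rng = np.random.default_rng(1)
xdata = np.linspace(0, 5, 50)
ydata = f(xdata, 2.0, 1.5) * (1 + 0.05 * rng.normal(size=50))

# option 1: sigma=ydata divides each residual by ydata,
# i.e. it minimizes sum( ((f(x, *p) - y) / y)**2 )
popt, pcov = curve_fit(f, xdata, ydata, p0=[1.0, 1.0], sigma=ydata)

# option 2: the same relative-error objective written explicitly
res = least_squares(
    lambda p: (f(xdata, *p) - ydata) / ydata,
    x0=[1.0, 1.0],
)

print(popt, res.x)
```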

mikuszefski
  • A couple of things: you could also add the inverse weights as an alternative option to the log-space conversion. Please balance the parentheses in the last line. – SKPS Mar 31 '21 at 02:14