
I have a scatter plot with only 5 data points, to which I would like to fit a curve. I have tried both polyfit and the following code, but neither is able to produce a curve with so few data points.

import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt

def func(x, a, b, c):
    return a * np.exp(-b * x) + c

plt.plot(xdata, ydata, ".", label="Data")
optimizedParameters, pcov = opt.curve_fit(func, xdata, ydata)
plt.plot(xdata, func(xdata, *optimizedParameters), label="fit")

Attached is an example of the plot, along with an example of the kind of curve I am trying to produce (apologies for the bad drawing). Thanks!

[image: scatter plot of the data, with a hand-drawn example of the desired curve]

    Imo you should add your imports and the points to provide a [mcve]. – SpghttCd Sep 17 '19 at 20:13
  • Apologies, I am using pandas but it would also work with 2 arrays of x= (7e-09, 9e-09, 1e-08, 2e-8, 1e-6) and y=(790, 870, 2400, 2450, 3100). I don't think I can edit the question to include this. – qwerty Sep 17 '19 at 21:05
  • Note, your image suggests an exponential regression on a semilog plot (see the x-axis). I imagine this is not what you want. You likely want a simple regression on normal axes, yes? – pylang Sep 17 '19 at 22:05

3 Answers


Here is an example graphical Python fitter using the data in your comment, fitting a Polytrope-type equation. In this example there is no need to take logarithms of the data; the X axis is plotted on a decade logarithmic scale. Please note that the data in the example code are floating-point numbers.

[image: data points with the fitted Polytrope curve, X axis on a log scale]

import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

xData = numpy.array([7e-09, 9e-09, 1e-08, 2e-8, 1e-6])
yData = numpy.array([790.0, 870.0, 2400.0, 2450.0, 3100.0])


def func(x, a, b, offset):  # polytrope equation from zunzun.com
    return a / numpy.power(x, b) + offset


# these are the same as the scipy defaults
initialParameters = numpy.array([1.0, 1.0, 1.0])

# curve fit the test data
fittedParameters, pcov = curve_fit(func, xData, yData, initialParameters)

modelPredictions = func(xData, *fittedParameters) 

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print('Parameters:', fittedParameters)
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData), 1000)
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    plt.xscale('log') # comment this out for default linear scaling


    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
James Phillips
  • This answer is equally valid, since we cannot determine from so few points whether the best-fit graph should have an asymptote or is simply always growing but decelerating (which is a feature of logarithms). – Allan Lago Sep 17 '19 at 23:01
  • Nice. Do you have a link to your polytrope equation? – pylang Sep 18 '19 at 00:44
  • @pylang my zunzun.com open source Python web site's fitting interface for this equation is at http://zunzun.com/Equation/2/Miscellaneous/Polytrope%20With%20Offset/ and I used the site's "function finder" for equation search on the provided data. The web site uses the Differential Evolution genetic algorithm to determine initial parameter estimates for fitting non-linear equations, which is what makes such equation searches possible. – James Phillips Sep 18 '19 at 01:42
  • Thanks. Nice site. I mean resources that describe y = a / xb + c as a "polytrope". Do you have any sources? – pylang Sep 18 '19 at 04:05
  • @pylang My original research link from years ago is in my source code as http://planetmath.org/encyclopedia/Polytrope.html but this link seems to be invalid now. Please note that the equation has x raised to the power b, which I do not see in your comment. – James Phillips Sep 18 '19 at 11:25
  • Thanks. Since it is hard to cite, perhaps add a few comments on your polytrope. It seems like a specific power law relationship and seems to differ from this [polytrope](https://en.wikipedia.org/wiki/Polytrope) definition. – pylang Sep 18 '19 at 17:23
  • @pylang Yes, that astrophysics-specific link does not have the equation I used. It does not seem a good idea to use that astrophysics-specific link on my web site, as the equation is different than the one I have. – James Phillips Sep 18 '19 at 18:30

You would have to choose what you want to fit the curve to. From the look of your drawing, it seems you are trying to shape it to be somewhat logarithmic.

Here's a picture of a logarithmic regression:

[image: example of a logarithmic regression curve fitted to scattered points]

A logarithmic regression follows the form y = A + B ln(x). This is essentially a linear regression where, instead of fitting y vs. x, we fit y vs. ln(x).

So you may just take the natural log of the x-values of the points in your dataset and run a linear regression algorithm on it. The resulting coefficients are then A and B for y = A + B ln(x).
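As a minimal sketch of that recipe, using the x and y arrays from the question's comments (`np.polyfit` on ln(x) performs the linear regression):

```python
import numpy as np

x = np.array([7e-09, 9e-09, 1e-08, 2e-8, 1e-6])
y = np.array([790.0, 870.0, 2400.0, 2450.0, 3100.0])

# Linear least-squares fit of y against ln(x); polyfit returns
# the slope (B) first, then the intercept (A), for y = A + B*ln(x)
B, A = np.polyfit(np.log(x), y, 1)
print("A =", A, "B =", B)
```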

Picture Credits: http://mathworld.wolfram.com/LeastSquaresFittingLogarithmic.html

Edit: As James Phillips pointed out in his answer, it is also possible to model the curve in the form y = A*x^(-B) + C, since with so few points it cannot be determined whether the graph has a horizontal asymptote or is always growing but decelerating. A lot of curves are possible (for instance, y = A*B^(-x) + C could be another one), but you would need to choose what to model the data after.
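As a sketch of that model choice, one can fit both candidate forms to the data from the comments and compare their root-mean-square errors (the all-ones starting guess for the power-law form mirrors James Phillips' answer; `maxfev` is raised here as a precaution):

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([7e-09, 9e-09, 1e-08, 2e-8, 1e-6])
y = np.array([790.0, 870.0, 2400.0, 2450.0, 3100.0])

def log_model(x, a, b):       # y = A + B*ln(x)
    return a + b * np.log(x)

def power_model(x, a, b, c):  # y = A/x**B + C
    return a / np.power(x, b) + c

rmse = {}
for name, model, p0 in [("log", log_model, (1.0, 1.0)),
                        ("power", power_model, (1.0, 1.0, 1.0))]:
    params, _ = curve_fit(model, x, y, p0=p0, maxfev=10000)
    rmse[name] = np.sqrt(np.mean((model(x, *params) - y) ** 2))
    print(name, "RMSE:", rmse[name])
```

With only five points both forms fit plausibly, so the choice ultimately rests on what behavior you expect from the underlying process, not on the residuals alone.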

Allan Lago
  • Please see my answer to this question using a Polytrope type equation. – James Phillips Sep 17 '19 at 22:36
  • Based on my analysis, the assertion in your edit that "y=A* B^(-x) +C" would be a candidate fitting equation for the provided data is incorrect. Please provide fitted parameters for that equation. – James Phillips Sep 18 '19 at 01:35
  • @JamesPhillips I meant it was possible, as in it could be another model. Corrected it for clarity. Though, what exactly told you that this cannot be a fitting model? – Allan Lago Sep 18 '19 at 01:53
  • In my analysis, that equation yielded a very poor fit to the data – which is why I asked for your fitted parameters. – James Phillips Sep 18 '19 at 01:56

The exponential function does not fit your data well. Consider another modeling function.

Given

import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt


%matplotlib inline


x_samp = np.array([7e-09, 9e-09, 1e-08, 2e-8, 1e-6])
y_samp = np.array([790, 870, 2400, 2450, 3100])


def func(x, a, b):
    """Return a logarithmic result."""
    return a + b*np.log(x)


def func2(x, a, b, c):
    """Return a 'power law' result."""
    return a/np.power(x, b) + c

Code

From @Allan Lago's logarithmic model:

# REGRESSION ------------------------------------------------------------------
x_lin = np.linspace(x_samp.min(), x_samp.max(), 50)

w, _ = opt.curve_fit(func, x_samp, y_samp)     
print("Estimated Parameters", w)  

# Model
y_model = func(x_lin, *w)


# PLOT ------------------------------------------------------------------------
# Visualize data and fitted curves
plt.plot(x_samp, y_samp, "ko", label="Data")
plt.plot(x_lin, y_model, "k--", label="Fit")
plt.xticks(np.arange(0, x_samp.max(), x_samp.max()/2))
plt.title("Least squares regression")
plt.legend(loc="upper left")

Estimated Parameters [8339.61062739  367.6992259 ]

[image: logarithmic model fitted to the data]

Using @James Phillips' "Polytrope" model:

# REGRESSION ------------------------------------------------------------------
p0 = [1, 1, 1]
w, _ = opt.curve_fit(func2, x_samp, y_samp, p0=p0)     
print("Estimated Parameters", w)  

# Model
y_model = func2(x_lin, *w)


# PLOT ------------------------------------------------------------------------
# Visualize data and fitted curves
plt.plot(x_samp, y_samp, "ko", label="Data")
plt.plot(x_lin, y_model, "k--", label="Fit")
plt.xticks(np.arange(0, x_samp.max(), x_samp.max()/2))
plt.title("Least squares regression")
plt.legend()

Estimated Parameters [-3.49305043e-10  1.57259788e+00  3.05801283e+03]

[image: power-law ("Polytrope") model fitted to the data]

pylang