1

I have the below dataset:

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

## Given datapoints
xdata = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26])
ydata = np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.99330715, 0.98201379, 0.95257413, 0.88079708, 0.73105858, 0.5])


## Plot the data
plt.plot(xdata, ydata)
plt.xlabel('x')
plt.ylabel('sigmoid(x)')
plt.xlim([-1,31])
plt.ylim(0, 1.05)
plt.show()

The above data looks as such:

enter image description here

A curve needs to be caliberated and extrapolated for y decreasing from 1 to 0 by using curve_fit in python.

I am trying to use sigmoid function provided that 'y' is given and 'x' need to be found.

y=1/(1+e^(k(x-x_0)) )

The sigmoid function to fit 'x' is thus defined as such:

## Define sigmoid function to fit xdata
def sigmoid(y, x0, k):
    x = x0 + ((1/k)*(np.log((1/y)-1)))
    return x

## Initial guess
p0 = [np.median(xdata),       # x0
      0.1]                    # k

## Initialize curve fit
popt, pcov = curve_fit(sigmoid,
                       ydata,
                       xdata)


## Define values for y
y = np.arange(1,0,-0.001)

## Evaluate values for x
x = sigmoid(y, *popt)

## Plot tbe actual and fit data
plt.plot(xdata, ydata, 'o', label='data')
plt.plot(x,y, label='fit')
plt.xlim([-10,31])
plt.ylim(0, 1.05)
plt.legend(loc='best')
plt.show()

The fit data looks as such:

enter image description here

It is quite visible that the fit is not good.

Can somebody please let me know how I can fit a curve close to actual data?

NN_Developer
  • 417
  • 6

2 Answers2

3

The hitch is that in the equation to be fitted there is ln((1/y)-1) which is infinite for y=1. The points where y=1 must not be taken into account. Then the fitting is very good :

For y tending to 1 (but lower than 1) then x tends to -infinity which is consistent with the data and the shape of the curve.

JJacquelin
  • 1,529
  • 1
  • 9
  • 11
  • Could you please let me know the python codes that can be applied to achieve this task? – NN_Developer Mar 23 '23 at 13:47
  • 1
    Sorry I don't have Python in front of me. I did it with another software. This is a simple linear regression to compute the two parameters x0 and (1/k). – JJacquelin Mar 23 '23 at 15:25
2

You might find lmfit useful for this (disclaimer: I am a lead author), as it has sigmoidal step functions built in and can readily do fits and use those results to interpolate or extrapolate. It also gives more useful reports of results than curve_fit, with variable parameters that have meaningful names and uncertainties and correlations correctly calculated, sorted and assigned.

Your example might look like this:

import numpy as np
import matplotlib.pyplot as plt

from lmfit.models import StepModel, ConstantModel

#
xdata = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26])
ydata = np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.99330715, 0.98201379, 0.95257413, 0.88079708, 0.73105858, 0.5])

# create a logistic step function + an offset constant
model = StepModel(form='logistic') + ConstantModel()

# make a set of parameters with initial values
params = model.make_params(c=1, amplitude=-1, center=25, sigma=2.5)

# fit the data with the parameters and `x` independent variable
result = model.fit(ydata, params, x=xdata)

# print results
print(result.fit_report())

# evaluate the model with best-fit parameters to 
# interpolate and extrapolate to higher x values
xnew = np.linspace(15, 45, 121)
ynew = model.eval(result.params, x=xnew)


## Plot the data, best-fit result, and extrapolated data
plt.plot(xdata, ydata, 'o', label='data')
plt.plot(xdata, result.best_fit, '-', label='best fit')
plt.plot(xnew,  ynew, '+', label='exptrapolated')
plt.xlabel('x')
plt.ylabel('sigmoid(x)')
plt.xlim([-1,41])
plt.ylim(0, 1.05)
plt.legend()
plt.show()

which would print a report of

[[Model]]
    (Model(step, form='logistic') + Model(constant))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 82
    # data points      = 27
    # variables        = 4
    chi-square         = 6.4177e-06
    reduced chi-square = 2.7903e-07
    Akaike info crit   = -403.811763
    Bayesian info crit = -398.628416
    R-squared          = 0.99997896
[[Variables]]
    amplitude: -0.99669011 +/- 0.01320863 (1.33%) (init = -1)
    center:     25.9928183 +/- 0.02578216 (0.10%) (init = 25)
    sigma:      0.99867632 +/- 0.00629747 (0.63%) (init = 2.5)
    c:          1.00015992 +/- 1.1486e-04 (0.01%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(amplitude, center) = -0.997
    C(center, sigma)     = 0.954
    C(amplitude, sigma)  = -0.943
    C(sigma, c)          = 0.287
    C(amplitude, c)      = -0.223

and make a plot like this: plot of data, best-fit, and extrapolation with logistic step function

M Newville
  • 7,486
  • 2
  • 16
  • 29