For a series of experiments measured at different time points, I'm trying to compare fits that fixed parameters for all experiments with fits that have an individual set of parameters per experiment, using an information criterion (like Akaike IC). Is that possible with scipy.optimize.least_squares
?
Let's say that the measurements roughly follow sigmoid curves in time with parameters base
(level before step), height
(increase in level after step), stretch
(duration of the step) and t50
(the time where the level is base + height/2
). I model these in the code below, which returns a size [t, c]
numpy ndarray measurements
with t
time points (experiments) and c
values per time point:
import numpy as np;
import matplotlib.pyplot as plt;
# lifted, scaled, stretched and shifted sigmoid
def lssig ( t, base, height, stretch, t50 ):
return base + height / ( 1 + np.exp ( -1/stretch * ( t - t50 ) ) );
# number of curves per experiment
numcurves = 10;
# number of experiments (time pts)
num_experiments = 20;
# timepoints (with simulated variability in timing)
time_jitter = 2;
tstart = np.linspace ( -10, 30, num = num_experiments );
tstart += time_jitter * np.random.rand ( num_experiments );
# measurements + added noise (this is added noise, not the variability in the estimates)
measurements = np.ndarray ( shape = ( num_experiments, numcurves ) );
measure_noise = np.random.normal ( 0, .2, ( num_experiments, numcurves ) );
# Variability of model parameters between curves:
# par_spreading [0] -> variability in base (start value)
# par_spreading [1] -> variability in height (end - start)
# par_spreading [2] -> variability in stretch (duration of step)
# par_spreading [3] -> variability in t50 (middle of step)
par_spreading = np.ones ( 4 );
# parameters of the model:
# every curve can have its own parameters, variability between curves
# is given by the par_spreading value (see above) for that parameter
height = 8 * np.ones ( numcurves ) + par_spreading [0] * np.random.rand ( numcurves);
base = 2 * np.ones ( numcurves ) + par_spreading [1] * np.random.rand ( numcurves);
stretch = 4 * np.ones ( numcurves ) + par_spreading [2] * np.random.rand ( numcurves);
t50 = 9 * np.ones ( numcurves ) + par_spreading [2] * np.random.rand ( numcurves);
# fill the measurement array
for t in range(num_experiments):
for c in range (numcurves):
measurements[ t, c ] = lssig ( tstart [t],
base [c],
height [c],
stretch [c],
t50 [c],
);
measurements += measure_noise;
# plot curves per subject
plt.plot ( tstart, measurements );
plt.show ();
There is a very nice page with examples of using least_squares, where 1 sigmoid is fitted, and the parameters of the sigmoid are passed in an array theta
. Using that recipe I can fit the curve of the 1st experiment at every time point measurements[:,0]
:
from scipy.optimize import least_squares;
# point on the sigmoid, using array theta for parameters
def y ( theta, t ):
return theta[0] + theta[1] / ( 1 + np.exp ( -1/theta[2] * ( t - theta[3] ) ) );
# objective function for least_squares: difference between noisy data & model
def fun(theta):
return y(theta, tstart) - measurements[:,0];
# start with values for base, step, stretch and t50
theta0 = [8,2,4,9];
res1 = least_squares(fun, theta0);
# show the parameters & plot the functions
print ( 'real: {}, estimated: {}'.format ( [ base[0], height[0], stretch[0], t50[0] ], res1.x ) );
#
plt.plot ( tstart, measurements[:,0], tstart, y(res1.x,tstart) );
plt.legend ( ('noisy', 'lsq' ) );
plt.show ( );
The reason that I used the 4 separate parameters in the code above is that base
can either be a single scalar (same for all curves per experiment) or an array (each curve has an individual base
value). Is it possible to use least_squares this way? Or do all the parameters need to be put in on 1D array?
It is possible to flatten the array of parameters (like with theta
) but that becomes incredibly messy because in the model where every experiment has its own base
parameter, the second parameter would be a value for base
, but in the model where there is one base
parameter for all experiments, the second value in the array would be a value for height
, and so on.
Ideally the parameters would still be separated, so that their length could vary independently. Is there an administratively traceable / aesthetically pleasing way to do this?