Python3, scipy.optimize: Fit model to multiple datas sets

Question

I have a model which is defined as:

m(x,z) = C1*x^2*sin(z)+C2*x^3*cos(z)

I have multiple data sets for different z (z=1, z=2, z=3), in which they give me m(x,z) as a function of x.

The parameters C1 and C2 have to be the same for all z values.

So I have to fit my model to the three data sets simultaneously otherwise I will have different values of C1 and C2 for different values of z.

It this possible to do with scipy.optimize. I can do it for just one value of z, but can't figure out how to do it for all z's.

For one z I just write this:

def my_function(x,C1,C1):
    z=1
return C1*x**2*np.sin(z)+ C2*x**3*np.cos(z)


data = 'some/path/for/data/z=1'

x= data[:,0]
y= data[:,1]

from lmfit import Model

gmodel = Model(my_function)
result = gmodel.fit(y, x=x, C1=1.1)

print(result.fit_report())

How can I do it for multiple set of datas (i.e different z values?)

score 0 · Accepted Answer · answered Oct 05 '19 at 20:48

So what you want to do is fit a multi-dimensional fit (2-D in your case) to your data; that way for the entire data set you get a single set of C parameters that bests describes your data. I think the best way to do this is using scipy.optimize.curve_fit().

So your code would look something like this:

import scipy.optimize as optimize
import numpy as np

def my_function(xz, *par):
    """ Here xz is a 2D array, so in the form [x, z] using your variables, and *par is an array of arguments (C1, C2) in your case """
    x = xz[:,0]
    z = xz[:,1]
    return par[0] * x**2 * np.sin(z) + par[1] * x**3 * np.cos(z)

# generate fake data. You will presumable have this already
x = np.linspace(0, 10, 100)
z = np.linspace(0, 3, 100)
xx, zz = np.meshgrid(x, z) 
xz = np.array([xx.flatten(), zz.flatten()]).T
fakeDataCoefficients = [4, 6.5]
fakeData = my_function(xz, *fakeDataCoefficients) + np.random.uniform(-0.5, 0.5, xx.size)

# Fit the fake data and return the set of coefficients that jointly fit the x and z
# points (and will hopefully be the same as the fakeDataCoefficients
popt, _ = optimize.curve_fit(my_function, xz, fakeData, p0=fakeDataCoefficients)

# Print the results
print(popt)

When I do this fit I get precisely the fakeDataCoefficients I used to generate the function, so the fit works well.

So the conclusion is that you don't do 3 fits independently, setting the value of z each time, but instead you do a 2D fit which takes the values of x and z simultaneously to find the best coefficients.

score 0 · Answer 2 · answered Oct 05 '19 at 21:01

Your code is incomplete and has a few syntax errors.

But I think that you want to build a model that concatenates the models for the different data sets, and then fit the concatenated data to that model. Within the context of lmfit (disclosure: author and maintainer), I often find it easier to use minimize() and an objective function for multiple data set fits rather than the Model class. Perhaps start with something like this:

import lmfit
import numpy as np
# define the model function for each dataset 
def my_function(x, c1, c2, z=1): 
    return C1*x**2*np.sin(z)+ C2*x**3*np.cos(z)

# Then write an objective function like this
def f2min(params, x, data2d, zlist):
    ndata, npts = data2d.shape
    residual = 0.0*data2d[:]
    for i in range(ndata):
        c1 = params['c1_%d' % (i+1)].value
        c2 = params['c2_%d' % (i+1)].value
        residual[i,:] = data[i,:] - my_function(x, c1, c2, z=zlist[i])
    return residual.flatten()

# now build that `data2d`, `zlist` and build the `Parameters`
data2d = []
zlist = []
x = None
for fname in dataset_names:
    d = np.loadtxt(fname)  # or however you read / generate data
    if x is None: x = d[:, 0]
    data2d.append(d[:, 1])
    zlist.append(z_for_dataset(fname)) # or however ...

data2d = np.array(data2d)  # turn list into nd array
ndata, npts = data2d.shape
params = lmfit.Parameters()
for i in range(ndata):
    params.add('c1_%d' % (i+1), value=1.0) # give a better starting value!
    params.add('c2_%d' % (i+1), value=1.0) # give a better starting value!

# now you're ready to do the fit and print out the results:
result = lmfit.minimize(f2min, params, args=(x, data2d, zlist))
print(results.fit_report())

That code really a sketch and is all untested, but hopefully will give you a good starting foundation.

@M Newville: why do you prefer minimize() with an objective funtion over Model.fit()? — westr, Oct 22 '19 at 13:50
@westr Well, try writing this example as a Model. It would have to account for multiple data sets, each with its own set of parameters. Do that for 2 datasets and then for 10 datasets. In comparison, doing the sum over N datasets in the objective function is easy. — M Newville, Oct 22 '19 at 18:20

Python3, scipy.optimize: Fit model to multiple datas sets

2 Answers2