Fitting two sets of data with two different model functions simultaneously giving a unique optimal parameter

Question

I am new to python. Using curve_fit from scipy.optimize, I am trying to fit two sets of data with two different model functions (each for one set of data) simultaneously with the same parameters which should be optimised.

Here a sketch of the code for fitting:

def f(x,a,b,c,d,e): 
   return some function

def g(x,a,b,c,d,e): 
   return some other function

a=1
b=2
c=3
d=4
e=5

guesspar=(a,b,c,d,e)
optimalparf, covf=opti.curve_fit(f,x,ydata1,guesspar,some sigma)

print optimalparf


guesspar=(a,b,c,d,e)
optimalparg, covg=opti.curve_fit(g,x,ydata2,guesspar,some sigma)

print optimalparg

where guesspar is the initial value of parameters, optimalparf and optimalparg are the optimal values I am searching for, ydata1 and ydata2 are the two sets of data, and covf and covg are covariance matrices.

Now, my problem is the following: I do get two different sets of optimal value for guesspar which is obviously wrong, since the optimal values should be the same for the whole diagram, that is, for both model functions. (Besides that, one of the sets of the optimal value is nonsense in the context I am interested in).

I know, that the code I have written here is very misleading. I would appreciate a hint at how one could fit two sets of data with two different functions fitting each set of data at the same time which leads to a unique set of optimal parameter.

PS: Here the original code:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import ion 
import time
import random as rd 
import scipy.optimize as opti 
import sys
from numpy import *

n0=1.5 

data =np.genfromtxt('some data') 
data=1000*data

pos=[]
for j in range(len(data)): 
pos.append(np.arcsin(np.sin(np.deg2rad(data[j,0]/1000))/1.5))

m1=[]
for j in range(len(data)): 
m1.append(data[j,1]) 

m1Sig=[]
for j in range(len(data)): 
m1Sig.append(data[j,2]) 

p1=[]
for j in range(len(data)): 
p1.append(data[j,3]) 

p1Sig=[]
for j in range(len(data)): 
p1Sig.append(data[j,4]) 

zero=[]
for j in range(len(data)): 
zero.append(data[j,5]) 

zeroSig=[]
for j in range(len(data)): 
zeroSig.append(data[j,6]) 

#define theta plot-range
thetaMin=-0.5 #[rad]
thetaMax=0.5 
thetaStep=1./635.
theta=np.arange(thetaMin,thetaMax,thetaStep) 
#define r plot-range
rMin=0.02 
rMax=0.09

comboY = np.append(m1, p1)

comboX = np.append(pos, pos)
comboTheta = np.append(theta, theta)

def rM1(theta,lam,d0,deltan,per,y0): 
return y0+((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.*np.sin(np.sqrt(((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.+((np.pi*d0*(-np.arcsin(lam/(2*per*n0))-theta))/per)**2.))**2./(((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.+((np.pi*d0*(-np.arcsin(lam/(2*per*n0))-theta))/per)**2.) 


def rP1(theta,lam,d0,deltan,per,y0): 
return y0+((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.*np.sin(np.sqrt(((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.+((np.pi*d0*(np.arcsin(lam/(2*per*n0))-theta))/per)**2.))**2./(((np.pi*deltan*d0)/(lam*np.cos(theta)))**2.+((np.pi*d0*(np.arcsin(lam/(2*per*n0))-theta))/per)**2.)


def combinedFunction(comboData,lam,d0,deltan,per,y0):

result1 = rM1(theta,lam,d0,deltan,per,y0)
result2 = rP1(theta,lam,d0,deltan,per,y0)

return np.append(result1, result2)

lam1=0.633
d01=100.
deltan1=0.0005
per1=1. 
y01=0.02

m1Err=np.sqrt(m1)
p1Err=np.sqrt(p1)

comboErr=np.append(m1Err,p1Err)

startParam=[lam1, d01 ,deltan1, per1, y01]

popt, pcov = opti.curve_fit(combinedFunction, comboX, comboY, startParam) 
print popt 

lam,d0,deltan,per,y0 = popt

y_fit_1 = rM1(theta,lam,d0,deltan,per,y0) # first data set, first equation
y_fit_2 = rP1(theta,lam,d0,deltan,per,y0) # second data set, second equation

plt.plot(comboX, comboY, '.') # plot the raw data
plt.plot(pos, y_fit_1,'b') # plot the equation using the fitted parameters
plt.plot(pos, y_fit_2,'r') # plot the equation using the fitted parameters
plt.show()

print('lam,d0,deltan,per,y0:', popt)

score 0 · Answer 1 · answered Aug 18 '18 at 14:06

I think this example will help, it has a single shared parameter for two data sets and two functions, just make all of the parameters shared and it should be what you need.

import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

y1 = np.array([ 16.00,  18.42,  20.84,  23.26])
y2 = np.array([-20.00, -25.50, -31.00, -36.50, -42.00])
comboY = np.append(y1, y2)

x1 = np.array([5.0, 6.1, 7.2, 8.3])
x2 = np.array([15.0, 16.1, 17.2, 18.3, 19.4])
comboX = np.append(x1, x2)

if len(y1) != len(x1):
    raise(Exception('Unequal x1 and y1 data length'))
if len(y2) != len(x2):
    raise(Exception('Unequal x2 and y2 data length'))


def function1(data, a, b, c): # not all parameters are used here, c is shared
        return a * data + c

def function2(data, a, b, c): # not all parameters are used here, c is shared
        return b * data + c


def combinedFunction(comboData, a, b, c):
    # single data reference passed in, extract separate data
    extract1 = comboData[:len(x1)] # first data
    extract2 = comboData[len(x1):] # second data

    result1 = function1(extract1, a, b, c)
    result2 = function2(extract2, a, b, c)

    return np.append(result1, result2)


# some initial parameter values
initialParameters = np.array([1.0, 1.0, 1.0])

# curve fit the combined data to the combined function
fittedParameters, pcov = curve_fit(combinedFunction, comboX, comboY, initialParameters)

# values for display of fitted function
a, b, c = fittedParameters

y_fit_1 = function1(x1, a, b, c) # first data set, first equation
y_fit_2 = function2(x2, a, b, c) # second data set, second equation

plt.plot(comboX, comboY, 'D') # plot the raw data
plt.plot(x1, y_fit_1) # plot the equation using the fitted parameters
plt.plot(x2, y_fit_2) # plot the equation using the fitted parameters
plt.show()

print('a, b, c:', fittedParameters)

Thanks! Nice example! But, it didn't work in my case. I actually want to fit two functions which are the same but shifted, i.e., a function like Gaussian bell on the interval (1,4) and another Gaussian bell on the interval (7,10). By the way, the tails of both bells are defined on the whole x-axis. I adapted your code, but it didn't work. I got a straight line as fitting curve. Any idea? — Rose, Aug 20 '18 at 14:31
You gave the intervals, can you post a link to the data and give the two functions? My guess is starting parameters... — James Phillips, Aug 20 '18 at 18:52
I should mention that the scipy.optimize module has a genetic algorithm which can be used to determine initial parameter estimates for curve_fit(). I have an example of using this scipy module to find initial parameter estimates for fitting a double Lorentzian equation to Raman spectroscopy of carbon nanotubes at https://bitbucket.org/zunzuncode/ramanspectroscopyfit — James Phillips, Aug 21 '18 at 13:40
Please excuse me for overseeing your comments! I edited the OP so that you can have a look at what I tried to do. Please see PS. — Rose, Sep 03 '18 at 18:40

score 0 · Answer 2 · answered Aug 21 '18 at 09:58

0

If you wouldn't mind using an additional package which wraps scipy, then the symfit package I wrote would allow you to do this quite simply. Example code:

from symfit import variables, parameters, Fit

# These are your datasets
xdata = np.array([...])
ydata = np.array([...])
zdata = np.array([...])

a, b, c, d, e = parameters('a, b, c, d, e')
x, y, z = variables('x, y, z')

model_dict = {
    y: a * x + b * x**2 + ...,
    z: a / x + b / x**2 + ...
}

fit = Fit(model_dict, x=xdata, y=ydata, z=zdata)
fit_result = fit.execute()
print(fit_result)

That's it! I assumed here that the two datasets share the same x-axis, but even that does not have to be the case. The parameters will now be optimized to give the optimal solution to this system of equations.

For more info, you can find the documentation here.

answered Aug 21 '18 at 09:58

tBuLi

2,295
2
16
16

What method is used in symfit to determine initial parameter estimates? – James Phillips Aug 21 '18 at 13:46
1

@JamesPhillips By default, they are always set to 1. In order to change it, you can set for example `a.value = 5.0` before calling `fit.execute()`. You can also set a range by setting `a.min` and/or `a.max` to some value. If you need something more intelligent you could also run a global minimizer first such as described [here](https://symfit.readthedocs.io/en/latest/fitting_types.html#finding-the-optimal-solution), but this is computationally (much) more expensive so it might be overkill depending on your problem. – tBuLi Aug 21 '18 at 13:57
1

Thanks for your code! I got a problem. When I run my codes according to your method I get the following error: `AttributeError: 'Variable' object has no attribute 'cos'`, since I have 'cos' function in my code. Any idea why I get this error? – Rose Aug 30 '18 at 07:46
I think you probably used `numpy.cos` in your code when defining the `model_dict`? Be careful, the model has to be fully symbolic, not numeric. So you should import `cos` and similar functions from `symfit` instead. E.g. `from symfit import cos, sin, exp` etc. Good luck and let me know if you have other questions or if it works, always nice to get feedback ;). – tBuLi Aug 30 '18 at 08:43
Thanks! I already noticed that I should use `from symfit import cos, pi, asin` and remove `numpy.cos`. But, I still get the same error. Don't know why! – Rose Aug 31 '18 at 10:40
I have removed `cos` function from the model to see what happens. The next error I get is: `AttributeError: 'Add' object has no attribute 'sqrt'`. It seems that none of the functions can be imported from `symfit`. – Rose Aug 31 '18 at 11:10
@Rose, In that case I'm afraid that I cannot help you without seeing your code. Perhaps you can edit your question to include the full minimal working example to raise this error, with the traceback of the exception, or perhaps more neatly you could start a new question specifically for this and link to the answer so I can answer it there. – tBuLi Sep 02 '18 at 09:25
I have started a new question: https://stackoverflow.com/questions/52146955/fitting-two-different-data-sets-by-two-model-functions-using-symfit – Rose Sep 03 '18 at 09:33

Fitting two sets of data with two different model functions simultaneously giving a unique optimal parameter

2 Answers2