lmfit and minimize function for multiple datasets

Question

I have a problem with the script used to simultaneously describe a set of data with some common parameters. The program is quite long and complex, and I am reporting the essential parts here for the sake of simplicity. Full script and a set of data are avaible at the end of this post.

In the first step, I import the data to be fitted. qexp are the values of my independent variable, RRexp is my experimental data to be modelled. RRerr is their uncertainty and qres is the uncertainty in my independent variable (not used here).

data = []
Files = ['LC-VoS-2.4-00.mft', 'LC-VoS-2.4-10.mft', 'LC-VoS-2.4-30.mft', 'LC-VoS-2.4-50.mft'] #, 'LC-VoS-2.3-65.mft']
for i in range(len(Files)):
  qexp, RRexp, RRerr, qres = np.loadtxt(Files[i], unpack=True, skiprows=23, usecols=(0,1,2,3))
  dataset = np.zeros(shape=(len(qexp),5))
  dataset[:,0:4] = np.column_stack((qexp*10, RRexp, RRerr, qres*10))
  data.append(dataset)

The I define all the parameters I need to calculate the theoretical curves (I am showing just some of them):

fit_params = Parameters()
#Parameters valid for each humidity series, values guessed  
fit_params.add('N', value = 3.85 , min = 1.0, max = 15.0, vary=True) 
fit_params.add('dd_alk', value = 0.80, min = 0.4, max = 2.5, vary=True) 
...
for i in range(len(Files)):
fit_params.add_many(('sig_l' + str(i), 0.15, False, 0.0, None),   
                    ('sig_N' + str(i), 0.3, False, 0.0,  None))

This function returns the residuals to the f2min_inc function, called by minimize. data1 and data2 are calculated by the calcRR function, using the x-values (qexp) contained in the data array, the SLDx arrays and some parameters contained in GFP[]. The calculated values are stored in data[:][:,4].

def Resid_inc(Files,SLD1,SLD2,data,GFP): #Function calculates the residuum for the fit. 
  res = []
  data1 = CalcRR(data, SLD1, GFP)
  data2 = CalcRR(data, SLD2, GFP)
  for i in range(len(Files)):
        data[i][:,4] = data1[i][:,4]*(1.0-fit_params['N'].value+int(fit_params['N'].value))+data2[i][:,4]*(fit_params['N'].value-int(fit_params['N'].value))
  for i in range(len(Files)):
    linres = (data[i][:,1]-data[i][:,4])/data[i][:,2]
    res = np.append(res, linres)
  return res

This is an intermediate function (which can be probably skipped), called by the minimize function, returns the residuals. In the first line, four arrays, Phi_s1, Phi_s2, SLD1, and SLD2, are calculated based on the fit_parameters. These arrays are needed to calculate the theoretical curve, which is done in the Resid_inc function.

def f2min_inc(fit_params):
  Phi_s1, Phi_s2, SLD1, SLD2 = nr.Par(fit_params, Files, GFP, coher)
  res = Resid_inc(Files,SLD1,SLD2,data,GFP)
  return res

Here I am calling the lmfit minimize function. out = minimize(f2min_inc, fit_params, iter_cb=printout, method='leastsq')

Though I did not define args in minimize, the software works smoothly, no error messages appear, the fitting procedure runs for a while and then ends. However, the change in the parameters is minimal, and they are returned without standard deviation. The report looks as follows:

[[Fit Statistics]]
    # function evals   = 103
    # data points      = 1193
    # variables        = 4
    chi-square         = 290351.201
    reduced chi-square = 244.198
[[Variables]]
    N:            3.85000000 +/- 0        (0.00%) (init= 3.85)
    dd_alk:       0.80196356 +/- 0        (0.00%) (init= 0.8)
    dd_eo:        0.40098178 +/- 0        (0.00%)  == 'dd_alk*0.34/0.34/2.'
    dd_ch:        2.09873200 +/- 0        (0.00%) (init= 2.1)
    dd_ch0:       1.49802709 +/- 0        (0.00%) (init= 1.5)
    dd_ch_last:   0.5 (fixed)
    dd_SiO2:      2.3 (fixed)
    sig_Si:       0.2 (fixed)
    sig_SiO2:     0.25 (fixed)
    a:            0 (fixed)
    b:            0          +/- 0          == 'a'
    sig_l0:       1 (fixed)
    sig_N0:       0.05 (fixed)
    f_eo0:        0 (fixed)
    f_ch0:        0 (fixed)
    f_ch00:       0 (fixed)
    f_holech0:    0.45 (fixed)
    f_holeo0:     0.45000000 +/- 0        (0.00%)  == 'f_holech0'
    sig_l1:       1 (fixed)
    sig_N1:       0.1 (fixed)
    f_eo1:        0.1 (fixed)
    f_ch1:        0.05 (fixed)
    f_ch01:       0.05 (fixed)
    f_holech1:    0 (fixed)
    f_holeo1:     0 (fixed)
    sig_l2:       1 (fixed)
    sig_N2:       0.08 (fixed)
    f_eo2:        0.2 (fixed)
    f_ch2:        0.15 (fixed)
    f_ch02:       0.15 (fixed)
    f_holech2:    0 (fixed)
    f_holeo2:     0 (fixed)
    sig_l3:       1 (fixed)
    sig_N3:       0.08 (fixed)
    f_eo3:        0.5 (fixed)
    f_ch3:        0.28 (fixed)
    f_ch03:       0.42 (fixed)
    f_holech3:    0 (fixed)
    f_holeo3:     0 (fixed)

As you can see, the parameters were slightly, but insignificantly, changed. I thought this might arise to the complexity of the model. However, the problem persists even if I try to minimize one single parameter. Also by changing minimization algorithm has no effect on the output, which remains always very close to the guessed parameters. I am also sure, that I am not at the minimum, as I can get a better guess by changing the initial parameters.

I fear the problem lies in how I wrote the script, even though I cannot identify any evident problem (but my programming skills are quite limited).

The full script with a set of data can be downloaded here: https://tubcloud.tu-berlin.de/s/4oUzu6d5Mvwczle

Thanks in advance for your comments.

score 0 · Answer 1 · answered Oct 04 '17 at 02:24

It is always helpful to reduce the problem to the simplest, minimal example. With a complicated example, there are just too many places to examine. To be sure, lmfit can handle complicated examples with many parameters and complex functions, but you have to either build these up from simple versions or test the different components.

And, if you do not include complete code (such as nr.Par(), which is where the fit parameters get sent) how do you expect us to know what the code is doing? We can only guess -- why are you asking for help and deliberately not giving all the information you have available?

OK, but also: your Resid_inc uses fit_params['N']. Where does this come from? Did you expect that it would use the fitting parameters?

Since your parameter N is not changed at all in the fit (not even slightly), and since you use fit_params['N'] in Resid_inc without passing it in, I will hazard a guess:

Resid_inc is the only place you use the value of the parameter N.
you are under the mistaken impression that the fit_params passed into your f2min_inc will be those used in Resid_inc. In fact, they won't be. In each call, f2min_inc will be given a lmfit.Parameters with updated values that will be put into the object named fit_params within (and local to) the function f2min_inc. The values in your outer module code, and that you pass into minimize() also happen to be named fit_params, but this is not the same object, and these values will not be altered.

In short, I think you want to pass fit_params to your Resid_inc function.

Thanks for the time you took looking at the question. To the first point, as mentioned in the question, I attached the full code with a set of data, to deliberately give to you all the information I have. I posted in question the essentials of the code, as I thought it helps to explain the problem. Indeed, by passing fit_params to Resid_inc solves the problem. Thanks for the suggestion. — leonardo2887, Oct 04 '17 at 08:04

lmfit and minimize function for multiple datasets

1 Answers1