
I am trying to get familiar with the nonlinear fitting procedure `dual_annealing`. To do so I generated some synthetic data and tried to fit a basic Fürth formula to them; see the code below.
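The model implemented in `Furth` is y(x) = A*x + B*(exp(-a*x) - 1), with fit parameters A, B and a: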

import numpy as np
import matplotlib.pyplot as plt
from lmfit import Minimizer, Parameters, report_fit
from scipy.optimize import dual_annealing

def Furth(A, B, a, x):
    # Furth formula: linear term plus a saturating exponential correction
    model = A*x + B * (np.exp(-a*x) - 1)
    return model

Generating synthetic data:

x = np.arange(200)*0.5
x = x[1:]   # drop x = 0
A = 1.88
B = 2.35
a = 5602
y = Furth(A, B, a, x) + np.random.randn(x.size)   # add unit-variance Gaussian noise
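
A quick check worth noting on this synthetic data: with a = 5602 and x >= 0.5, np.exp(-a*x) underflows to zero over the whole sampled range, so the exponential term contributes only the constant offset -B:

print(np.exp(-a*x).max())   # 0.0 -- the B and a terms reduce to a constant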

Defining the function to fit:

def fit_msd2(params, x, data):
    A = params['A']
    B = params['B']
    c = params['c']

    model = A*x + B * (np.exp(-c*x) - 1)
    return model - data   # lmfit minimizes the sum of squares of this residual

params = Parameters()
params.add('A', min=0, max=100000)
params.add('B', min=-100, max=100000)
params.add('c', min=0, max=100000)
# do fit with dual_annealing

minner = Minimizer(fit_msd2, params, fcn_args=(x, y))
print(minner)
result = minner.minimize(method="dual_annealing")
print(result)
# calculate final result
final = y + result.residual   # data + residual = best-fit model
#print(final)

# write error report
report_fit(result)
fig, ax = plt.subplots(figsize = (10,10))
ax.grid()
ax.set_ylabel(r'$\Delta_{msd}$(t) [$\mu$m]', fontsize=18)
ax.set_xlabel(r'Time Lag $\Delta t$ [s]', fontsize=18)
ax.loglog(x, y, label = 'Synthetic data')
ax.loglog(x, final, 'r-', linewidth=3,label='Fit')
ax.legend(loc='best', fontsize = 12)

[Graphical fit results: log-log plot of the synthetic data with the fit overlaid]

I believe the fit is not working properly and I cannot understand why. Is there any library or check routine I can use to evaluate this? If not, can someone suggest another way to perform this simulated annealing fit?

Thanks in advance

  • Do you have a link where the Furth formula is explained? What `A, B, c` do you get after fitting? – André Jul 26 '21 at 11:37
  • For the Furth formula you can take a look at Eq. 2: https://pagines.uab.cat/vmendez/sites/pagines.uab.cat.vmendez/files/72.pdf – namedunframed Jul 26 '21 at 11:50
  • Regarding the parameters of my fit: A=1.87, B=2.25, c=78977 – namedunframed Jul 26 '21 at 11:52
  • Can you add plots of `(x,y)` with and without the normal noise? What happens if you fit without the normal noise? S.t. `y = Furth(A,B,a,x)` – André Jul 26 '21 at 12:20
  • Without the noise the fit still ends up very far from what it should capture. I cannot attach images in a comment; I'll comment on the post below. – namedunframed Jul 26 '21 at 12:43
  • you can edit the original question to append the images :) My next suggestion will be a bit too large for the comment section, so I will create an answer with my best guess. – André Jul 26 '21 at 12:53
  • you really, really need to give physically reasonable starting values for every parameter, and that starting value should not be at a boundary value. Boundaries should be used to set physical limits. Finally, while it can be useful to look at a plot of the fit, it is always (really, always) vital to look at the actual fit report. Without the fit report you do not know what the fit did. – M Newville Jul 27 '21 at 14:15
  • I am not really sure how the fit report would be helpful in my particular case. I am just trying to play around with synthetic data so that I can later guess the parameters for the experimental data. The latter are time-correlated with each other, though, so classical statistics no longer apply. Hence chi-squared and similar measures won't be reliable anymore. – namedunframed Jul 28 '21 at 10:47
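
Following up on the starting-value point in the comments above, a minimal sketch of what that could look like (the value= numbers and the tighter bounds here are illustrative guesses, not values from the discussion):

params = Parameters()
params.add('A', value=2.0, min=0, max=100)      # start near a physically plausible slope
params.add('B', value=2.0, min=-100, max=100)   # start well inside the bounds
params.add('c', value=1.0, min=0, max=100)      # bounds as physical limits, per the comment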

1 Answer


My best guess would be to play around with the parameters of the optimizer. Your loss still seems too large, so the optimizer is probably accepting a solution too early. Since lmfit builds on scipy, it might be possible to pass the optimizer's arguments through like this:

# [...]
opt_args = {
    'maxiter': 5000,       # originally 1000
    'initial_temp': 7500,  # originally 5230
    'accept': -2.0,        # originally -5.0
}
minner = Minimizer(fit_msd2, params, fcn_args=(x, y))
result = minner.minimize(method="dual_annealing", **opt_args)

The above parametrization makes the optimizer more explorative, especially at the beginning of the optimization, and lets it optimize for longer. A full list of the dual_annealing parameters can be found in the SciPy documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.dual_annealing.html

Does this lead to an improvement?
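
If passing the arguments through lmfit does not help, here is a minimal sketch of calling SciPy's dual_annealing directly on the same model (it reuses x and y from the question; the seed value is arbitrary):

import numpy as np
from scipy.optimize import dual_annealing

def sse(p, x, data):
    # dual_annealing expects a scalar objective, so return the sum of squares
    A, B, c = p
    model = A*x + B * (np.exp(-c*x) - 1)
    return np.sum((model - data)**2)

bounds = [(0, 100000), (-100, 100000), (0, 100000)]  # same limits as the Parameters
res = dual_annealing(sse, bounds, args=(x, y), maxiter=5000, seed=42)
print(res.x, res.fun)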

Also, I would suggest squaring your error:

def fit_msd2(params, x, data):
    A = params['A']
    B = params['B']
    c = params['c']

    model = A*x + B * (np.exp(-c*x)-1) 
    return (model - data)**2  # or np.power(model - data, 2), depending on data type
André
  • There are no improvements unfortunately. But thank you for the help! – namedunframed Jul 26 '21 at 13:31
  • You can also try to use the squared error. Besides that, I fail to see any errors in your code. Fiddling around with the optimizer might still be worth it, as it should essentially be able to find good values for the approximation. – André Jul 26 '21 at 13:40
  • I am just guessing that this fitting routine is not able to escape the local minima of my function, and I simply have to find a more robust procedure to be able to solve the problem. – namedunframed Jul 26 '21 at 13:49