Python - Minimizing Chi-squared

Question

I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately using the code below is not correctly minimizing the chisqfunc function. It is finding the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions which has worked correctly. Could you please suggest how to fix the code below or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result

Thank you for reading my question and any help would be greatly appreciated.

Cheers, Will

EDIT: Data set I am currently using: Link to data

The code looks fine to me, and provides good results on some mock data I created on my machine. Just one note: is your variable `err_stress` the uncertainty in the measured stress for different values of deformation correct? — gg349, Mar 04 '14 at 16:49
one trivial point: you're checking the result with `print result.x` right? minimize does NOT update `x0`. — gg349, Mar 04 '14 at 16:53
Thanks for the quick reply, I did use `print result.x`. Yes `err_stress` is an array of errors for the array of measured stresses. — Will282, Mar 04 '14 at 17:41
What are `result.success` and `result.status` after calling `minimize`? — Warren Weckesser, Mar 04 '14 at 21:28
`result.success` is `True`, `result.status` is 0 and `result.x` is `[0 0]`. Thanks for the assistance Warren. — Will282, Mar 04 '14 at 23:31
@flebool , I have uploaded a the set of data that I'm currently using: [Link to data](https://drive.google.com/file/d/0B1YR_XKF3wtmYl9pNHJHWEZFU3M/edit?usp=sharing) — Will282, Mar 05 '14 at 14:22

score 5 · Accepted Answer · answered Mar 05 '14 at 16:17

The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc() like print (a,b), and rerun your code, you'll get something like:

(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)

This means that minimize evaluates the function only at these points.

if you now try to evaluate chisqfunc() at these 3 pairs of values, you'll see that they EXACTLY match, for example

print chisqfunc((0,0))==chisqfunc((1.4901161193847656e-08,0))
True

This happens because of rounding floating points arithmetics. In other words, when evaluating stress - model, the var stress is too many order of magnitude larger than model, and the result is truncated.

One could then just try bruteforcing it, increasing floating point precision, with writing data=data.astype(np.float128) just after loading the data with loadtxt. minimize fails, with result.success=False, but with a helpful message

Desired error not necessarily achieved due to precision loss.

One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude, the other to rescale the data, so that the solution will be closer to your initial guess (0,0).

It is MUCH better if you just rescale the data, making for example nondimensional with respect to a certain stress value (like the yelding/cracking of this material)

This is an example of the fitting, using as a stress scale the maximum measured stress. There are very few changes from your code:

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]


smax = stress.max()
stress = stress/smax
#I am assuming the errors err_stress are in the same units of stress.
err_stress = err_stress/smax

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result
assert result.success==True
a,b=result.x*smax
plot(strain,stress*smax)
plot(strain,a+b*strain)

Your linear model is quite good, i.e. your material has a very linear behaviour for this range of deformation (what material is it anyway?): enter image description here

Ahh, I tried using the results of `stats.linregress` as the initial conditions which also didn't work unfortunately. I think the reason it didn't work is because the stress values are so large not much change in `chisqfunc` is experienced with small change in `x0`. Rescaling using `smax` worked a treat. Thank you very much for your help on this. This data is a sample of a data set (linear section only) for the extension of a Cu98%/Be2% wire. — Will282, Mar 06 '14 at 11:35

Python - Minimizing Chi-squared

1 Answers1

Linked