0

I would like to fit a line that uses the inverse of the error bars as weights.

I binned my data x and y, into 10 bins ( ~26 points each ) and took their mean. This is not a matrix of values so polyfit isn't happy with that.

# note I didn't test this pseudo code....
import numpy as np
import scipy

x = np.random.randn(100)
y = np.random.rand(100)

x_bins = np.linspace(x.min(), x.max(), 10) # reduction of data into 10 bins 
y_bins = np.linspace(y.min(), y.max(), 10) 

x_bin = np.digitize(x, x_bins)
y_bin = np.digitize(y, y_bins)

x_mu = np.zeros(10)
y_mu = x_mu.copy()
err = x_mu.copy()   

for i in range(10):
    x_mu = np.mean(x[x_bin==i]) 
    y_mu = np.mean(y[y_bin==i])
    err = np.std([y[y_bin==i])


x_mu[np.isnan(x_mu)] = 0
y_mu[np.isnan(y_mu)] = 0
errror[np.isnan(error)] = 0

plt.errorbar(x_mu, y_mu, err, fmt='o')

EDIT: scipy.polyfit stopped complaining about ill conditioned inputs...

out = scipy.polyfit(x_mu, y_mu, deg=1, w=error)
CT Zhu
  • 52,648
  • 17
  • 120
  • 133
wbg
  • 866
  • 3
  • 14
  • 34
  • Can you show an actual, tested example? Above `x` ranges goes from 0 to 1, but you seem to histogram in 0 to 100. – Benjamin Bannier Sep 26 '14 at 05:47
  • +1 to Benjamin Bannier Fixing my example code helped me solve some dumb issue and now polyfit works... :/ – wbg Sep 26 '14 at 06:15

1 Answers1

1

A numpy.polyfit does not allow you to explicitly specify uncertainties. Instead you could use scipy.optimize.curve_fit, e.g.

import numpy as np
import scipy
import scipy.optimize

x = np.linspace(0,1, 100)
y = np.random.rand(100)

# bin the data
n, bins = np.histogram(y, 10, [0, 1])
xb = bins[:-1] + 0.05  # at bin center; has overflow bin
yb = n                 # just the per-bin counts
err = sqrt(n)          # from Poisson statistics
plt.errorbar(xb, yb, err, fmt='ro')

# fit a polynomial of degree 1, no explicit uncertainty
a1, b1 = np.polyfit(xb, yb, 1)
plt.plot(xb, a1*xb + b1, 'b') 

# fit explicitly taking uncertainty into account
f = lambda x, a, b: a*x + b  # function to fit
# fit with initial guess for parameters [1, 1]
pars, corr = scipy.optimize.curve_fit(f, xb, yb, [1, 1], err)
a2, b2 = pars
plt.plot(xb, a2*xb + b2, 'r')

enter image description here

To properly interpret the fit you need to examine the correlation matrix of the fitted parameters, but that goes beyond this technical question.

Benjamin Bannier
  • 55,163
  • 11
  • 60
  • 80
  • I like your example..it's easier than the cook book example that honestly was too much for me this late at night... – wbg Sep 26 '14 at 06:33
  • I'd like to ask you about interpreting the fit in another thread if you are interested... – wbg Sep 26 '14 at 17:27