While doing a simple implementation of gradient descent (fitting a straight line to sample points), I predicted the line fairly accurately with the iterative method, but with fmin_cg() the accuracy went down. My first thought was to increase the 'maxiter' parameter of the function, but surprisingly it had no effect at all (the results are the same with maxiter = 1 and 1000). So two questions are on my mind:

1. Why does the number of times fmin_cg() evaluates f and fprime have no effect? Shouldn't the accuracy of the result be proportional to it?
2. Does fmin_cg() (if provided with an apt fprime) guarantee to return the parameters at which f is the minimum possible?
My code:
from numpy import shape, c_, ones, zeros   # used by the snippets below
import scipy.optimize

def gradDesc(theta, data, alpha=None, iterations=None):
    X = data[:, 0]
    y = data[:, 1]
    m = shape(X)[0]
    X = c_[ones((m, 1)), X]
    y = y.reshape(m, 1)
    hypo = X.dot(theta)
    grad = zeros((2, 1))   # doubles as the parameter vector in the iterative branch
    if alpha is not None:  # iterative method
        for i in range(0, iterations):
            hypo = X.dot(grad)
            ausi = X.T.dot(hypo - y)
            grad -= alpha / m * ausi
    else:  # returns derivative of cost(), to use fmin_cg in run()
        grad = X.T.dot(hypo.reshape(m, 1) - y) / m
    # print(grad)
    return grad.flatten()
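For reference, a minimal way to call the two modes looks like this (the data, starting theta, alpha and iterations here are made-up placeholders, not necessarily the values in the full code):

from numpy import zeros, linspace, column_stack

# hypothetical sample points lying exactly on a straight line, just for illustration
x = linspace(0, 10, 50)
data = column_stack((x, 2.0 * x + 1.0))
theta0 = zeros(2)

# iterative mode: alpha and iterations are given, gradDesc runs the descent loop itself
theta_iter = gradDesc(theta0, data, alpha=0.01, iterations=1500)

# fmin_cg mode: gradDesc only returns the gradient at theta, fmin_cg drives the iterations
theta_cg, minCost = run(theta0, data)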
def run(theta, data):
    result = scipy.optimize.fmin_cg(cost, fprime=gradDesc, x0=theta,
                                    args=(data,), maxiter=1, disp=False, full_output=True)
    theta = result[0]
    minCost = result[1]
    return theta, minCost
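Since full_output=True is passed, the rest of the returned tuple can also be inspected to see how many evaluations fmin_cg actually performed and why it stopped (per the SciPy docs, the return is xopt, fopt, func_calls, grad_calls, warnflag). A sketch, using the placeholder theta0/data from the snippet above:

result = scipy.optimize.fmin_cg(cost, fprime=gradDesc, x0=theta0,
                                args=(data,), maxiter=1, disp=True, full_output=True)
theta, minCost, func_calls, grad_calls, warnflag = result
# warnflag: 0 = converged, 1 = maxiter exceeded, 2 = gradient and/or function calls not changing
print(func_calls, grad_calls, warnflag)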
The cost function:
def cost(theta, data):
    X, y = data[:, 0], data[:, 1]
    m = shape(X)[0]
    y = y.reshape(m, 1)
    X = c_[ones((m, 1)), X]
    J = X.dot(theta) - y
    # print((J.T.dot(J) / (2*m))[0, 0])
    return (J.T.dot(J) / (2*m))[0, 0]
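As a sanity check that gradDesc (in its no-alpha mode) really matches the derivative of cost, scipy.optimize.check_grad can compare it against a finite-difference gradient. A sketch, again with the placeholder theta0/data from above:

from scipy.optimize import check_grad

# returns the 2-norm of the difference between the analytic and the numerical gradient of cost
err = check_grad(cost, gradDesc, theta0, data)
print(err)   # should be near zero if gradDesc is an apt fprime for cost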
Full code: http://ideone.com/IbB3Gb (both versions; just toggle the comments on lines 4 and 5 of the linked file) :)