I have the following code to minimize the cost function using its gradient.
import scipy.optimize
from numpy import *   # the code below assumes numpy names (shape, zeros, ones, random, c_) are in scope

def trainLinearReg( X, y, lamda ):
    # theta = zeros( (shape(X)[1], 1) )
    theta = random.rand( shape(X)[1], 1 )  # random initialization of theta
    result = scipy.optimize.fmin_cg( computeCost, fprime = computeGradient, x0 = theta,
                                     args = (X, y, lamda), maxiter = 200, disp = True,
                                     full_output = True )
    return result[1], result[0]
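For reference, with full_output = True, scipy.optimize.fmin_cg returns the tuple (xopt, fopt, func_calls, grad_calls, warnflag), so result[0] is the optimized theta and result[1] is the final cost. A small sketch of a more explicit unpacking (the variable names here are only illustrative):

theta_opt, cost, n_func_calls, n_grad_calls, warnflag = result
# warnflag is 0 on success, 1 if maxiter was reached, and 2 for the
# "Desired error not necessarily achieved due to precision loss" case
return cost, theta_opt   # equivalent to return result[1], result[0]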
But I am getting this warning:
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 8403387632289934651424768.000000
Iterations: 0
Function evaluations: 15
Gradient evaluations: 3
My computeCost and computeGradient are defined as:
def computeCost( theta, X, y, lamda ):
    theta = theta.reshape( shape(X)[1], 1 )
    m = shape(y)[0]
    J = 0
    grad = zeros( shape(theta) )
    h = X.dot(theta)
    squaredErrors = (h - y).T.dot(h - y)
    # theta[0] = 0.0
    J = (1.0 / (2 * m)) * (squaredErrors) + (lamda / (2 * m)) * (theta.T.dot(theta))
    return J[0]
def computeGradient( theta, X, y, lamda ):
    theta = theta.reshape( shape(X)[1], 1 )
    m = shape(y)[0]
    J = 0
    grad = zeros( shape(theta) )
    h = X.dot(theta)
    squaredErrors = (h - y).T.dot(h - y)
    # theta[0] = 0.0
    J = (1.0 / (2 * m)) * (squaredErrors) + (lamda / (2 * m)) * (theta.T.dot(theta))
    grad = (1.0 / m) * (X.T.dot(h - y)) + (lamda / m) * theta
    return grad.flatten()
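One way to rule out a broken gradient (a common cause of this warning) is to compare computeGradient against a numerical approximation with scipy.optimize.check_grad. This is only a diagnostic sketch, assuming X, y and lamda are already in scope and that computeCost returns a plain float (e.g. J[0, 0]) so check_grad gets a scalar:

from scipy.optimize import check_grad

theta0 = zeros( shape(X)[1] )  # any starting point works for the check
err = check_grad( computeCost, computeGradient, theta0, X, y, lamda )
print( 'gradient check error:', err )  # should be tiny relative to the gradient's norm

If the check error is large, the analytic gradient is the first thing to fix; if it is small, the problem is more likely in the data or its scaling.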
I have reviewed these similar questions:
scipy.optimize.fmin_bfgs: “Desired error not necessarily achieved due to precision loss”
scipy.optimize.fmin_cg: "Desired error not necessarily achieved due to precision loss."
scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"
But I still cannot find the solution to my problem. How can I make the minimization converge instead of getting stuck at the first iteration?
ANSWER:
I solved this problem based on @lejlot's comments below.
He is right: the values in the data set X are too large, because I did not assign the normalized result back to the correct variable. Even though this is a small mistake, it does point to where we should look when encountering such problems: a cost function value this large suggests that something is wrong with the data set itself.
The previous wrong one:
X_poly = polyFeatures(X, p)
X_norm, mu, sigma = featureNormalize(X_poly)   # normalized features end up in X_norm ...
X_poly = c_[ones((m, 1)), X_poly]              # ... but the un-normalized X_poly is used here
The correct one:
X_poly = polyFeatures(X, p)
X_poly, mu, sigma = featureNormalize(X_poly)   # normalized features now overwrite X_poly
X_poly = c_[ones((m, 1)), X_poly]
where X_poly is then actually used for training as
cost, theta = trainLinearReg(X_poly, y, lamda)
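For completeness, featureNormalize above is the usual column-wise z-score normalization from the exercise; this is only a sketch of what it is assumed to do, not the original implementation:

def featureNormalize( X ):
    # z-score normalization: subtract the per-column mean and divide by the per-column std
    mu = mean( X, axis = 0 )
    sigma = std( X, axis = 0 )
    X_norm = (X - mu) / sigma
    return X_norm, mu, sigma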