I am using the example from here, where the original post had an objective function returning a list whose first element is the value of the objective function and whose second element is the gradient:
sigmoid <- function(z) 1 / (1 + exp(-z))  # logistic helper used below (not shown in the original post)

logisticRegressionCost <- function(theta, X, y) {
  theta <- as.matrix(theta)
  X <- as.matrix(X)
  y <- as.matrix(y)
  m <- dim(y)[1]                       # number of training examples
  predicted <- sigmoid(X %*% theta)    # shared by the cost and the gradient
  J <- sum((-y) * log(predicted) - (1 - y) * log(1 - predicted)) / m
  grad <- t(t(predicted - y) %*% X) / m
  return(list(fn = J, gr = grad))
}
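As a quick sanity check (assuming the standard logistic sigmoid, which the original post does not show), the cost at theta = 0 should be log(2), since every prediction is 0.5:

```r
sigmoid <- function(z) 1 / (1 + exp(-z))  # assumed definition of the helper

# compact restatement of the cost/gradient pair above, so this snippet runs on its own
cost <- function(theta, X, y) {
  p <- sigmoid(X %*% theta)
  m <- nrow(X)
  list(fn = sum(-y * log(p) - (1 - y) * log(1 - p)) / m,
       gr = t(X) %*% (p - y) / m)
}

set.seed(1)
X <- cbind(1, matrix(rnorm(20), 10, 2))  # 10 examples: intercept + 2 features
y <- matrix(rbinom(10, 1, 0.5))
out <- cost(matrix(0, 3, 1), X, y)
out$fn  # log(2) ~ 0.6931, because sigmoid(0) = 0.5 for every example
```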
The suggested solution for using optim is to split this into two separate functions that serve as wrappers, e.g.:
fn <- function(...) {
  logisticRegressionCost(...)$fn
}
gr <- function(...) {
  logisticRegressionCost(...)$gr
}
and thus optim can be called like optim(fn = fn, gr = gr, ...).
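Put together, a minimal run might look like this (toy data of my own; the sigmoid is assumed to be the standard logistic function):

```r
sigmoid <- function(z) 1 / (1 + exp(-z))  # assumed helper

# condensed version of the cost/gradient function above
logisticRegressionCost <- function(theta, X, y) {
  theta <- as.matrix(theta); X <- as.matrix(X); y <- as.matrix(y)
  p <- sigmoid(X %*% theta)
  m <- nrow(y)
  list(fn = sum(-y * log(p) - (1 - y) * log(1 - p)) / m,
       gr = t(X) %*% (p - y) / m)
}

fn <- function(...) logisticRegressionCost(...)$fn
gr <- function(...) logisticRegressionCost(...)$gr

set.seed(42)
X <- cbind(1, matrix(rnorm(200), 100, 2))   # intercept + 2 features
theta_true <- c(-1, 2, 0.5)                 # made-up generating parameters
y <- matrix(rbinom(100, 1, sigmoid(X %*% theta_true)))

# extra arguments X and y are forwarded by optim to both fn and gr
res <- optim(par = rep(0, 3), fn = fn, gr = gr, X = X, y = y, method = "BFGS")
res$par  # estimates close to theta_true
```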
However, this is unsatisfactory, since computing the gradient generally shares intermediate work with the objective function. In this case, the line:
predicted <- sigmoid(X %*% theta)
will definitely be computed twice for the same theta.
Is there a way to use optim so that shared computations between the objective function and the gradient are performed efficiently?