I want to do Poisson regression, but I need my regression function to run faster than glm
and with at least as much precision. Consider the following experiment:
## Here is some "data":
da = data.frame(matrix(c(0,1,212,1,0,200,1,1,27), nrow = 3, byrow = TRUE))
names(da) = c("c1", "c2", "c")
## I want to do a Poisson regression of c on c1 and c2 and an intercept.
## Here is my function that uses optim for Poisson regression with the data da to find the intercept term:
zglm2 = function(precision = 1){ #predictors = best.terms, data = ddat, normalized = normalized
# The design matrix
M = as.matrix(cbind(rep(1, nrow(da)), da[,1:2]))
# Initialize beta, the coefficients
beta = rep(0, 3)
# State the log-likelihood (up to a constant) for the data da and parameter beta:
neg.pois.log.like.prop = function(beta){
log.lambda = M%*%beta # log-expected cell counts under poisson model
return(-sum(-exp(log.lambda) + da$c*log.lambda))}
# State the gradient of the log-likelihood:
grad.fun = function(beta){a = exp(M%*%beta)-da$c; return(t(a)%*%M)}
# Estimate the MLE
beta = optim(beta, neg.pois.log.like.prop, method = "BFGS", gr = grad.fun, control = list(reltol = precision*sqrt(.Machine$double.eps)))$par
return(beta[1])}
## Here are two ways of estimating the intercept term:
# Method 1
zglm2(precision = 1)
# Method 2
as.numeric(glm(c ~ 1+c1+c2, data = da, family = poisson)$coefficients[1])
My function, zglm2
uses R's optim
routine to find the maximum likelihood solution to the Poisson regression problem (for this special case). zglm2
takes an argument precision
; values of this argument that are smaller than 1 make optim
run beyond its default termination criteria to achieve greater precision.
Unfortunately, the results of Method 1 and Method 2 are too different (for my purposes); 7.358 versus 7.359. Giving a smaller value, like 0.01, for the precision
argument brings the two methods into reasonable agreement, leading me to suspect that R's glm
function is very precise.
So here is my question: what governs the level of precision in the result of glm
? Perhaps as a sub-question, what algorithm does glm
use to find the maximum of the likelihood (I've dug into the source code but it's not easy for me).