-1

I want to do Poisson regression, but I need my regression function to run faster than glm and with at least as much precision. Consider the following experiment:

## Here is some "data":
da = data.frame(matrix(c(0,1,212,1,0,200,1,1,27), nrow = 3, byrow = TRUE))
names(da) = c("c1", "c2", "c")

## I want to do a Poisson regression of c on c1 and c2 and an intercept.

## Here is my function that uses optim for Poisson regression with the data da to find the intercept term:
zglm2 = function(precision = 1){  #predictors = best.terms, data = ddat, normalized = normalized
  # The design matrix
  M = as.matrix(cbind(rep(1, nrow(da)), da[,1:2]))
  # Initialize beta, the coefficients
  beta = rep(0, 3)
  # State the log-likelihood (up to a constant) for the data da and parameter beta:
  neg.pois.log.like.prop = function(beta){
    log.lambda = M%*%beta # log-expected cell counts under poisson model
    return(-sum(-exp(log.lambda) + da$c*log.lambda))}
  # State the gradient of the log-likelihood:
  grad.fun = function(beta){a = exp(M%*%beta)-da$c; return(t(a)%*%M)}
  # Estimate the MLE
  beta = optim(beta, neg.pois.log.like.prop, method = "BFGS", gr = grad.fun, control = list(reltol = precision*sqrt(.Machine$double.eps)))$par 
  return(beta[1])}

## Here are two ways of estimating the intercept term:
# Method 1
zglm2(precision = 1)
# Method 2
as.numeric(glm(c ~ 1+c1+c2, data = da, family = poisson)$coefficients[1])

My function, zglm2 uses R's optim routine to find the maximum likelihood solution to the Poisson regression problem (for this special case). zglm2 takes an argument precision; values of this argument that are smaller than 1 make optim run beyond its default termination criteria to achieve greater precision.

Unfortunately, the results of Method 1 and Method 2 are too different (for my purposes); 7.358 versus 7.359. Giving a smaller value, like 0.01, for the precision argument brings the two methods into reasonable agreement, leading me to suspect that R's glm function is very precise.

So here is my question: what governs the level of precision in the result of glm? Perhaps as a sub-question, what algorithm does glm use to find the maximum of the likelihood (I've dug into the source code but it's not easy for me).

zkurtz
  • 3,230
  • 7
  • 28
  • 64

1 Answers1

0

I'm having difficulty believing that "you have dug into the code" very deeply, since there is a a "control" parameter and a function that delivers argumetn to glm:

?glm
# control   
#        a list of parameters for controlling the fitting process. 
#        For glm.fit this is passed to glm.control.
IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • Yes, I had looked at the `control` argument documentation but did not find it very helpful. At any rate, I have become dissatisfied with my question and would like to remove it. – zkurtz Sep 11 '13 at 00:35