I'm trying to fit a zero-inflated negative binomial distribution to count data by minimizing the negative of the log likelihood function with optim. I've defined the negative log likelihood with 'newfunct' below:
newfunct <- function(theta,xvec,obs_freq){
theta[2] <- exp(theta[2]) + 10^-10
theta[3] <- exp(theta[3]) + 10^-10
obs_freq <- c(172,33,14,12,5,5,1,1,1,4,5,1,1,2,1,2,1,1,1,1,1,2,1,1)
xvec <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,17,19,21,23,24,28,29,41,49,53)
gfunct <- function(mu,k,xvec) {
(gamma(1/k+xvec)*(k*mu)^xvec)/(factorial(xvec)*gamma(1/k)*
(1+k*mu)^(1/k+xvec))
}
f <- 3861*log(theta[1]+(1-
theta[1])/((1+theta[2]*theta[3])^(1/theta[3])))+sum(obs_freq*log((1-
theta[1])*gfunct(theta[2],theta[3],xvec)))
return(-f)
}
theta contains my 3 parameters, xvec contains the count values, and obs_freq contains the observed frequencies of those values. I am expecting all 3 parameters to be > 0, but optim returns negative values. For example:
> optim(theta <- c(.9,log(5),log(2)),newfunct,hessian = TRUE)
$par
[1] -29.140687 -4.912860 6.872521
$value
[1] 1464.211
$counts
function gradient
502 NA
$convergence
[1] 1
$message
NULL
$hessian
[,1] [,2] [,3]
[1,] 0.3167963 -4.003138 5.549092
[2,] -4.0031384 116.634414 -4.014256
[3,] 5.5490918 -4.014256 163.029903
Am I using the incorrect optimization method, or are my initial parameter values defined incorrectly? I've been struggling with this particular problem for a long time, and I'd appreciate any insight as to why I'm getting the negative par values. Thanks!