7

I am trying to use the command mle2, in the package bbmle. I am looking at p2 of "Maximum likelihood estimation and analysis with the bbmle package" by Bolker. Somehow I fail to enter the right start values. Here's the reproducible code:

l.lik.probit <-function(par, ivs, dv){
Y <- as.matrix(dv)
X <- as.matrix(ivs)
K <-ncol(X)
b <- as.matrix(par[1:K])
phi <- pnorm(X %*% b) 
sum(Y * log(phi) + (1 - Y) * log(1 - phi)) 
}

n=200

set.seed(1000)

x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n) 
x4 <- rnorm(n) 

latentz<- 1 + 2.0 * x1 + 3.0 * x2 + 5.0 * x3 + 8.0 * x4 + rnorm(n,0,5)

y <- latentz 
y[latentz < 1] <- 0 
y[latentz >=1] <- 1 
x <- cbind(1,x1,x2,x3,x4)
values.start <-c(1,1,1,1,1)   

foo2<-mle2(l.lik.probit, start=list(dv=0,ivs=values.start),method="BFGS",optimizer="optim", data=list(Y=y,X=x)) 

And this is the error I get:

Error in mle2(l.lik.probit, start = list(Y = 0, X = values.start), method = "BFGS",  : 
  some named arguments in 'start' are not arguments to the specified log-likelihood function

Any idea why? Thanks for your help!

EOM
  • 71
  • 2
  • 1
    `values.start` is not specified. You have to define it. There is also a typo in `foo2<<-`. –  May 15 '12 at 13:12
  • Thanks for the quick answer! I made those changes (my starting values being values.start <-c(1,1,1,1,1) ), but I still get the same error message. I believe there is some incongruity between the mle2 command and the function I specified, but I cannot figure it out for the life of me! – EOM May 15 '12 at 14:11
  • 1
    Are you implementing a [probit regression](http://www.ats.ucla.edu/stat/R/dae/probit.htm)? –  May 15 '12 at 14:33
  • I am doing a heteroskedastic probit, the formula is almost the same, so I am reproducing here the simple probit, which gives me the exact same problem, the starting values seem to be misspecified...but why? – EOM May 15 '12 at 15:39
  • The formulation of `l.lik.probit` is strange, because it assigns *external* values `x` and `y` to its *arguments* `X` and `Y`. Also, the call to `mle2` uses named parameters "X" and "Y" twice each in two conflicting ways. I therefore suspect you may be reading an error message resulting from a long cascade of errors and it might not reflect all the things going wrong. Maybe you should first try the examples on the `mle2` manual page and then gradually modify them to fit your circumstances. – whuber May 15 '12 at 21:19
  • I tried to change the names, like you suggested, thanks! Unfortunately it does not work...keep reading the help file for mle2, but there seems to be a problem in the way start values are specified and I cannot figure it out – EOM May 15 '12 at 21:55

1 Answers1

6

You've missed a couple of things, but the most important is that by default mle2 takes a list of parameters; you can make it take a parameter vector instead, but you have to work a little bit harder.

I have tweaked the code slightly in places. (I changed the log-likelihood function to a negative log-likelihood function, without which this would never work!)

l.lik.probit <-function(par, ivs, dv){
    K <- ncol(ivs)
    b <- as.matrix(par[1:K]) 
    phi <- pnorm(ivs %*% b)
    -sum(dv * log(phi) + (1 - dv) * log(1 - phi)) 
}

n <- 200

set.seed(1000)

dat <- data.frame(x1=rnorm(n),
                  x2=rnorm(n),
                  x3=rnorm(n),
                  x4=rnorm(n))

beta <- c(1,2,3,5,8)
mm <- model.matrix(~x1+x2+x3+x4,data=dat)
latentz<- rnorm(n,mean=mm%*%beta,sd=5)

y <- latentz 
y[latentz < 1] <- 0 
y[latentz >=1] <- 1
x <- mm
values.start <- rep(1,5)

Now we do the fit. The main thing is to specify vecpar=TRUE and to use parnames to let mle2 know the names of the elements in the parameter vector ...

library("bbmle")
names(values.start) <- parnames(l.lik.probit) <- paste0("b",0:4)
m1 <- mle2(l.lik.probit, start=values.start,
           vecpar=TRUE,
           method="BFGS",optimizer="optim",
           data=list(dv=y,ivs=x))

As pointed out above for this particular example you have just re-implemented the probit regression (although I understand that you now want to extend this to allow for heteroscedasticity in some way ...)

dat2 <- data.frame(dat,y)
m2 <- glm(y~x1+x2+x3+x4,family=binomial(link="probit"),
    data=dat2)

As a final note, I would say that you should check out the parameters argument, which allows you to specify a sub-linear model for any one of the parameters, and the formula interface:

m3 <- mle2(y~dbinom(prob=pnorm(eta),size=1),
           parameters=list(eta~x1+x2+x3+x4),
           start=list(eta=0),
           data=dat2)

PS confint(foo2) appears to work fine (giving profile CIs as requested) with this set-up.

ae <- function(x,y) all.equal(unname(coef(x)),unname(coef(y)),tol=5e-5)
ae(m1,m2) && ae(m2,m3)
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453