6

I am trying to bootstrap a simple multinomial regression in R, and I am getting an error:

Error in is.data.frame(data) : object 'd' not found

What is really strange is that I am using the same code (adjusted to this particular problem) as in a tutorial for boot package at Quick-R, and that same code also worked when I am using different function (like lm()). For sure, I am doing something stupid, but I do not see what. Please, if anyone can help, I would appreciate a lot.

This is an example:

require(foreign)
require(nnet)
require(boot)

# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]

bs <- function(formula, data, indices) {
    d = data[indices,] # allows boot to select sample
    fit = multinom(formula, data=d)
    s = summary(fit)
    return(list(fit$coefficients, fit$standard.errors))
}

# 5 replications
results = list()
results <- boot(
    data=ml, statistic=bs, R=5, parallel='multicore',
    formula=prog~write
)
StupidWolf
  • 45,075
  • 17
  • 40
  • 72
striatum
  • 1,428
  • 3
  • 14
  • 31
  • Please post the output of `traceback()` in your question. –  Oct 16 '15 at 12:20
  • I did some sleuthing, and this might lead to a solution: there is something going in with the environment (seen by printing str(fit) from inside the `bs`-function. It is set to global, as you pass a formula and formula's have the enivronment in which they were created (see ?formula). Don't see how to fix it now though. – Heroka Oct 16 '15 at 13:26

2 Answers2

2

The error happens in the summary() part, also the object returned by multinom() does not have coefficients and standard.errors. It seems, that summary.multinom() in turn calculates the hessian from your data, d, which for some reason (probably a scoping issue) cannot be found. A quick fix is to add Hess = TRUE:

bs <- function(formula, data, indices) {
  d = data[indices,] # allows boot to select sample
  fit = multinom(formula, data=d, Hess = TRUE)
  s = summary(fit)
  return( cbind(s$coefficients, s$standard.errors) )
}

# 5 replications
results = list()
results <- boot(
  data=ml, statistic=bs, R=5, parallel='multicore',
  formula=prog~write
)
J.R.
  • 3,838
  • 1
  • 21
  • 25
1

Multinomial logistic regression returns a matrix of coefficients using the coef() function. This differs from a lm or glm model which returns a vector of coefficients.

library(foreign)     # read.dta()
library(nnet)        # multinom()
require(boot)        # boot()

# an example for multinomial logistic regression
ml = read.dta('http://www.ats.ucla.edu/stat/data/hsbdemo.dta')
ml = ml[,c(5,7,3)]

names(ml)

bs <- function(formula, data, indices) {
  d = data[indices,] # allows boot to select sample
  fit = multinom(formula, data=d, maxit=1000, trace=FALSE)
  #s = summary(fit)
  #return(list(fit$coefficients, fit$standard.errors))

  estimates <- coef(fit)
  return(t(estimates))
}

# enable parallel

library(parallel)
cl <- makeCluster(2)
clusterExport(cl, "multinom")

# 10000 replications
set.seed(1984)

results <- boot(
  data=ml, statistic=bs, R=10000, parallel = "snow", ncpus=2, cl=cl,
  formula=prog~write
)

# label the estimates

subModelNames <- colnames(results$t0)
varNames <- rownames(results$t0)

results$t0

estNames <- apply(expand.grid(varNames,subModelNames),1,function(x) paste(x,collapse="_"))

estNames

colnames(results$t) <- estNames

# summary of results

library(car)

summary(results)

confint(results, level=0.95, type="norm")
confint(results, level=0.95, type="perc")
confint(results, level=0.95, type="bca")

# plot the results

hist(results, legend="separate")
William Chiu
  • 388
  • 3
  • 19