1

So I'm using the quantreg package in R to conduct quantile regression analyses to test how the effects of my predictors vary across the distribution of my outcome.

FML <- as.formula(outcome ~ VAR + c1 + c2 + c3)
quantiles <- c(0.25, 0.5, 0.75)

q.Result <- list()
for (i in quantiles){
    i.no <- which(quantiles==i)
    q.Result[[i.no]] <- rq(FML, tau=i, data, method="fn", na.action=na.omit)
}

Then i call anova.rq which runs a Wald test on all the models and outputs a pvalue for each covariate telling me whether the effects of each covariate vary significantly across the distribution of my outcome.

anova.Result <- anova(q.Result[[1]], q.Result[[2]], q.Result[[3]], joint=FALSE)

Thats works just fine. However, for my particular data (and in general?), bootstrapping my estimates and their error is preferable. Which i conduct with a slight modification of the code above.

q.Result <- rqs(FML, tau=quantiles, data, method="fn", na.action=na.omit)
q.Summary <- summary(Q.mod, se="boot", R=10000, bsmethod="mcmb",
                            covariance=TRUE)

Here's where i get stuck. The quantreg currently cannot peform the anova (Wald) test on boostrapped estimates. The information files on the quantreg packages specifically states that "extensions of the methods to be used in anova.rq should be made" regarding the boostrapping method.

Looking at the details of the anova.rq method. I can see that it requires 2 components not present in the quantile model when bootstrapping.

1) Hinv (Inverse Hessian Matrix). The package information files specifically states "note that for se = "boot" there is no way to split the estimated covariance matrix into its sandwich constituent parts."

2) J which, according to the information files, is "Unscaled Outer product of gradient matrix returned if cov=TRUE and se != "iid". The Huber sandwich is cov = tau (1-tau) Hinv %*% J %*% Hinv. as for the Hinv component, there is no J component when se == "boot". (Note that to make the Huber sandwich you need to add the tau (1-tau) mayonnaise yourself.)"

Can i calculate or estimate Hinv and J from the bootstrapped estimates? If not what is the best way to proceed?

Any help on this much appreciated. This my first timing posting a question here, though I've greatly benefited from the answers to other peoples questions in the past.

lmo
  • 37,904
  • 9
  • 56
  • 69

2 Answers2

0

For question 2: You can use R = for resampling. For example:

anova(object, ..., test = "Wald", joint = TRUE, score =
"tau", se = "nid", R = 10000, trim = NULL)

Where R is the number of resampling replications for the anowar form of the test, used to estimate the reference distribution for the test statistic.

Just a heads up, you'll probably get a better response to your questions if you only include 1 question per post.

scribbles
  • 4,089
  • 7
  • 22
  • 29
  • Thanks, I'll give this a try right away. Thanks for the heads on questions as well. – JustGettinStarted Aug 17 '15 at 17:40
  • I've shortened my question and moved part of it elsewhere. – JustGettinStarted Aug 17 '15 at 18:52
  • @JustGettinStarted, did `R = ` work for you? If so please click the checkmark next to the answer so that others know that this part of the question is closed. – scribbles Aug 17 '15 at 18:54
  • Don't think `R =` is not performing what i need, in that the bootstrapped estimates from `se = "boot", R=10000, bsmethod="mcmb"` are not the subjects of the anova test. Rather `se="nid"` are the subject of the test. I cant report one set of estimates but do the anova on another – JustGettinStarted Aug 17 '15 at 19:09
0

Consulted with a colleague, and he confirmed that it was unlikely that Hinv and J could be 'reverse' computed from bootstrapped estimates. However we resolved that estimates from different taus could be compared using Wald test as follows.

From object rqs produced by

q.Summary <- summary(Q.mod, se="boot", R=10000, bsmethod="mcmb", covariance=TRUE)

you extract the bootstrapped Beta values for variable of interest in this case VAR, the first covariate in FML for each tau

boot.Bs <- sapply(q.Summary, function (x) x[["B"]][,2])
B0 <- coef(summary(lm(FML, data)))[2,1] # Extract liner estimate data linear estimate

Then compute wald statistic and get pvalue with number of quantiles for degrees of freedom

Wald <- sum(apply(boot.Bs, 2, function (x) ((mean(x)-B0)^2)/var(x)))
Pvalue <- pchisq(Wald, ncol(boot.Bs), lower=FALSE)

You also want to verify that bootstrapped Betas are normally distributed, and if you're running many taus it can be cumbersome to check all those QQ plots so just sum them by row

qqnorm(apply(boot.Bs, 1, sum))
qqline(apply(boot.Bs, 1, sum), col = 2)

This seems to be working, and if anyone can think of anything wrong with my solution, please share