Using R, I'm performing a backtest on a time series using quantile regression (quantreg::rq) on a number of features. Features are then selected based on a condition such as p-value <= 5%.

If I run the routine multiple times, I always end up with the same betas/coefficients on the features, but the p-values are unstable, and this happens ceteris paribus! For example, feature 1 might be selected on a first run and then dropped on the next, because its p-values are 0.049 and 0.0536, respectively.

Nothing has changed in the input data frame (checked multiple times)! This still happens even when I set a seed with set.seed.

Is this related to some random sampling that R uses internally to determine the level of significance?

Here is a reproducible example of what I mean:

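library(quantreg)
data(engel)
set.seed(12345)  # fix the RNG seed

QR_taus <- c(0.10, 0.20, 0.33, 0.50, 0.66, 0.80, 0.90)

for (i in 1:5) {
  mod <- rq(foodexp ~ income,
            tau = QR_taus,
            data = engel)
  # standard errors estimated by bootstrap
  summ <- summary(mod, se = "boot")

  # coefficient table (estimate, std. error, t value, p-value) for the first tau
  Residuals_QR <- summ[[1]][["coefficients"]]
  assign(paste("Residuals_QR_", i, sep = ""), Residuals_QR)
}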

Residuals_QR_1
Residuals_QR_2
Residuals_QR_3
Residuals_QR_4
Residuals_QR_5

Comparing the Residuals_QR_1 to Residuals_QR_5 tables, we can see that the coefficients are identical but the standard errors differ, and therefore so do the t values and p-values.

How can I fix this instability?

Thanks

I ran the quantreg::rq on the same data multiple times and, as expected, I get the same beta coefficients; however this is not the case for the associated p-values. This is something unexpected.

  • I'm sorry but isn't it obvious? You specify that standard errors should be derived by bootstrap, which involves an RNG. If you want it to return the exact same standard errors (and p-values) in each iteration, you need to set the seed in each iteration. – Roland Mar 08 '23 at 10:01
  • Yes Roland, you're right. I "missed" that component. Thanks a lot for your comment. – user12899748 Mar 08 '23 at 10:05

1 Answer

SOLVED

Reference (quantreg manual): https://cran.r-project.org/web/packages/quantreg/quantreg.pdf

The instability of standard errors comes from the boot component in

summary(mod, se = "boot")

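The documentation describes this option as: "boot" which implements one of several possible bootstrapping alternatives for estimating standard errors including a variate of the wild bootstrap for clustered response.

The bootstrap resamples the data using R's random number generator, so every call to summary draws different resamples and returns slightly different standard errors, t values and p-values. The coefficients themselves come from rq, which involves no randomness, so they never change. Calling set.seed once before the loop only fixes the first draw; as Roland points out in the comments, the seed has to be set in each iteration to get identical results.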
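As a minimal sketch of the fix suggested in the comments (re-seeding before each bootstrap call), the example below reuses the engel data from the question. The names taus and boot_coefs and the seed value are only illustrative choices.

```r
library(quantreg)
data(engel)

taus <- c(0.10, 0.50, 0.90)
mod <- rq(foodexp ~ income, tau = taus, data = engel)

# Re-seed the RNG immediately before every bootstrap call, so each
# iteration draws the same resamples.
boot_coefs <- lapply(1:3, function(i) {
  set.seed(12345)
  summary(mod, se = "boot")[[1]]$coefficients
})

# The coefficient tables (including std. errors and p-values) should now match.
same_1_2 <- isTRUE(all.equal(boot_coefs[[1]], boot_coefs[[2]]))
same_2_3 <- isTRUE(all.equal(boot_coefs[[2]], boot_coefs[[3]]))
print(c(same_1_2, same_2_3))
```

Note that this only makes the result reproducible. A p-value that sits close to the 5% threshold is still sensitive to the particular seed chosen.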

The analytic methods available in summary (for example se = "iid", "nid" or "ker") do not rely on random resampling, so they return the same p-values when run multiple times.
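To illustrate, here is a small sketch that calls summary twice with se = "nid", one of the non-bootstrap options, and checks that both runs agree without any seed being set. The variable names run1 and run2 are hypothetical, and the choice of "nid" over the other analytic methods is only for the sake of the example.

```r
library(quantreg)
data(engel)

# Single quantile, so summary() returns one summary object directly
mod <- rq(foodexp ~ income, tau = 0.5, data = engel)

# No set.seed() needed: "nid" computes standard errors analytically
run1 <- summary(mod, se = "nid")$coefficients
run2 <- summary(mod, se = "nid")$coefficients

runs_match <- isTRUE(all.equal(run1, run2))
print(runs_match)
```

Which standard-error method is appropriate depends on the assumptions you are willing to make about the errors, so switching away from the bootstrap is a modelling decision and not just a way to obtain stable output.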