Using R, I'm performing a backtest on a time series with quantile regression (quantreg::rq) on a number of features, which are then selected based on a condition such as p-value <= 5%.
If I run the routine multiple times I always end up with the same betas/coefficients for the features, but the p-values are unstable, and this happens ceteris paribus! For example, feature 1 may be selected on one trial and dropped on the next because its p-values are, respectively, 0.049 and 0.0536.
Nothing has changed in the input data frame (checked multiple times)! This still happens when I set a specific seed with set.seed.
Does this relate to some random sampling that R uses internally when computing the significance levels?
Here is a reproducible example of what I mean:
require(quantreg)
data(engel)
set.seed(12345)

QR_taus <- c(0.10, 0.20, 0.33, 0.50, 0.66, 0.80, 0.90)

for (i in 1:5) {
  mod <- rq(foodexp ~ income,
            tau  = QR_taus,
            data = engel)
  summ <- summary(mod, se = "boot")
  Residuals_QR <- summ[[1]][["coefficients"]]  # coefficient table for the first tau
  assign(paste("Residuals_QR_", i, sep = ""), Residuals_QR)
}
Residuals_QR_1
Residuals_QR_2
Residuals_QR_3
Residuals_QR_4
Residuals_QR_5
By comparing the Residuals_QR_* matrices, we can see that we end up with the same coefficients but different standard errors and, consequently, different t values and p-values.
How can I fix this instability?

Thanks
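For context, my working assumption is that summary.rq with se = "boot" draws fresh bootstrap resamples on every call, so a single set.seed before the loop only fixes the overall RNG stream, not each individual call. If that is right, reseeding immediately before each summary call should make every iteration identical:

```r
require(quantreg)
data(engel)

QR_taus <- c(0.10, 0.20, 0.33, 0.50, 0.66, 0.80, 0.90)
results <- vector("list", 5)

for (i in 1:5) {
  mod <- rq(foodexp ~ income, tau = QR_taus, data = engel)
  set.seed(12345)  # reseed before EACH bootstrap call, not once before the loop
  summ <- summary(mod, se = "boot")
  results[[i]] <- summ[[1]][["coefficients"]]
}

# All five coefficient tables should now be identical
all(sapply(results[-1], identical, results[[1]]))
```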
In short: I ran quantreg::rq on the same data multiple times and, as expected, I get the same beta coefficients; however, this is not the case for the associated p-values, which I did not expect.
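An alternative I am considering (again an assumption on my side, not something the quantreg docs told me to do here) is to avoid resampling entirely by requesting an analytic standard error such as se = "nid", which involves no random draws and so should be stable across runs:

```r
require(quantreg)
data(engel)

mod <- rq(foodexp ~ income, tau = c(0.10, 0.50, 0.90), data = engel)

# "nid" standard errors need no random sampling, so repeated calls agree
s1 <- summary(mod, se = "nid")[[1]][["coefficients"]]
s2 <- summary(mod, se = "nid")[[1]][["coefficients"]]
identical(s1, s2)
```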