In the context of variable selection, I'm trying to count the number of times a variable is selected over bootstrapped iterations. A simple version of the problem is provided below, along with my solution (answer). But my solution quickly becomes unwieldy when dealing with 50 or 100 variables.
I have the set of variable names I would like to count over (pred), so I thought it should be possible to create new columns based on those values and then detect the relevant string for each. But I can't figure out how without manually setting the column names and pasting the function. There must be a better way...
Any other solutions would be welcome, including tidyverse or purrr...
library(dplyr)

df <- mtcars
n <- nrow(df)
pred <- colnames(df)[2:length(df)]
target <- "mpg"
mpg_formula <- paste(target, "~", paste(pred, collapse = "+"))

steplm <- data.frame()
bootnum <- 10
for (i in 1:bootnum) {
  message("Fitting model ", i, " out of ", bootnum)
  # Bootstrap sample of row indices (with replacement)
  data.id <- sample(seq_len(n), replace = TRUE)
  # Backward stepwise selection on the resampled data
  fit.lms <- step(lm(mpg_formula, data = df[data.id, ]),
                  direction = "backward",
                  trace = 0)
  # Selected terms (intercept dropped), sorted and collapsed into one string
  selected.vars <- paste(sort(names(coef(fit.lms)[-1])), collapse = ", ")
  step.result <- data.frame("model" = selected.vars,
                            "nvar" = length(names(coef(fit.lms)[-1])))
  steplm <- dplyr::bind_rows(steplm, step.result)
}
# One logical column per predictor: TRUE if its name appears in the model string
answer <- steplm %>%
  transmute(
    cyl = grepl(pattern = "cyl", x = model),
    disp = grepl(pattern = "disp", x = model),
    hp = grepl(pattern = "hp", x = model),
    drat = grepl(pattern = "drat", x = model),
    wt = grepl(pattern = "wt", x = model),
    qsec = grepl(pattern = "qsec", x = model),
    vs = grepl(pattern = "vs", x = model),
    am = grepl(pattern = "am", x = model),
    gear = grepl(pattern = "gear", x = model),
    carb = grepl(pattern = "carb", x = model)
  )
This produces the following data.frame (or matrix). From it I can just sum the columns to get the counts I want, or do matrix operations to get pairwise and joint dependencies between terms. I mention this only to point out that the matrix format is needed for the next step...
  cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
 TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE
FALSE  TRUE FALSE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE
 TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
 TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE
 TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
 TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE
FALSE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
 TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
 TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
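For reference, here is a rough base-R sketch of the kind of thing I am after: building the same logical matrix directly from pred instead of typing one grepl() per column. It is only an illustration under some assumptions. The steplm used here is a small made-up stand-in (three invented model strings, not results from a real run), and the names selected, counts and co_selected are introduced just for this example. It splits each model string on ", " and tests exact membership, which also sidesteps a weakness of the grepl() approach: a pattern like "am" would match any longer variable name that contains it.

```r
# Variable names to count over, as in the question.
pred <- colnames(mtcars)[-1]

# Hypothetical stand-in for the bootstrap results: made-up model strings
# in the same "a, b, c" format as steplm$model (not output of a real run).
steplm <- data.frame(
  model = c("cyl, disp, hp", "am, wt", "carb, hp, wt"),
  stringsAsFactors = FALSE
)

# Split each model string into its variable names.
selected <- strsplit(steplm$model, ", ", fixed = TRUE)

# For each model, flag which names in pred were selected (exact match).
# vapply returns one column per model, so transpose to one row per model.
answer <- t(vapply(selected,
                   function(vars) pred %in% vars,
                   logical(length(pred))))
colnames(answer) <- pred

# Number of times each variable was selected.
counts <- colSums(answer)

# Pairwise co-selection counts; the diagonal repeats the per-variable counts.
co_selected <- crossprod(answer * 1)
```

answer stays a logical matrix with one row per bootstrap model and one column per entry of pred, so it scales to 50 or 100 variables without any hand-written column names. Wrapping it in as.data.frame() should give the data.frame form if that is preferred.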