I have the following data frame:
input.df <- dplyr::data_frame(x = rnorm(4),
y = rnorm(4),
`z 1` = rnorm(4))
I would like to do a multiple regression for each column with the other columns and extract the R-squared from each model. This means that I could run the following code:
summary(lm(x ~ ., data = input.df))
summary(lm(y ~ ., data = input.df))
summary(lm(`z 1` ~ ., data = input.df))
And note down the R-squared.
I'd like to automate this task and have two column data frame where the first column is the dependent variable and the second column is the R-squared.
This is what I've tried:
n <- ncol(input.df)
replicate(n, input.df, simplify = F) %>%
dplyr::bind_rows() %>%
dplyr::mutate(group = rep(names(.), each = nrow(.) / n)) %>%
dplyr::group_by(group) %>%
dplyr::do({
tgt.var <- .$group[1]
# How do I get the formula to interpret . as all variables?
lm(get(tgt.var) ~ ., data = .) %>%
broom::glance() %>%
dplyr::select(r.squared)
})
I've put a comment on the part I am stuck. I get the following error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels