I would like to apply a similar glm to several columns, and am wondering if there is a neat way to do this with the new dplyr functionality.
# data
set.seed(1234)
df <- data.frame(out1 = rbinom(100, 1, prob = 0.5),
                 out2 = c(rbinom(50, 1, prob = 0.2),
                          rbinom(50, 1, prob = 0.8)),
                 pred = factor(rep(letters[1:2], each = 50)))
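Following the method laid out in this post, I can use purrr::map:
library(dplyr)
library(purrr)
# fit one binomial glm per numeric column, with pred as the predictor
df %>%
  select_if(is.numeric) %>%
  map(~glm(. ~ df$pred,
           family = binomial))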
# output
# $out1
#
# Call: glm(formula = . ~ df$pred, family = binomial)
#
# Coefficients:
# (Intercept) df$predb
# 3.589e-16 -4.055e-01
#
# Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
# Null Deviance: 137.6
# Residual Deviance: 136.6 AIC: 140.6
#
# $out2
#
# Call: glm(formula = . ~ df$pred, family = binomial)
#
# Coefficients:
# (Intercept) df$predb
# -1.153 2.305
#
# Degrees of Freedom: 99 Total (i.e. Null); 98 Residual
# Null Deviance: 138.6
# Residual Deviance: 110.2 AIC: 114.2
This returns a list and works just fine. But I was wondering whether it is possible to use the new dplyr 1.0.0 functionality to get a similar (or even neater) result, i.e. the sort of neat, row-by-row data frame output returned by broom::glance or broom::tidy. Something along the lines of this blog post, but transposed to this version of the problem, and perhaps using across() (at a guess)?
Also, it would be nice if I could use starts_with("out") to select the columns that glm() is applied to.
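
To make the target concrete, here is a rough sketch of the kind of call I have in mind. It is only a guess at how across() could be combined with broom::tidy, not a confirmed idiom. It assumes dplyr >= 1.0.0, tidyr and broom are installed; the names res, outcome, fit and y are placeholders I made up for illustration. Each model is fitted on a small data frame built inside the lambda, so the formula does not depend on how across() evaluates its arguments.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, which introduced across()
library(tidyr)
library(broom)

# same example data as above
set.seed(1234)
df <- data.frame(out1 = rbinom(100, 1, prob = 0.5),
                 out2 = c(rbinom(50, 1, prob = 0.2),
                          rbinom(50, 1, prob = 0.8)),
                 pred = factor(rep(letters[1:2], each = 50)))

res <- df %>%
  # fit one binomial glm per selected column; each tidy() table is wrapped
  # in list() so that summarise() returns a single row of list-columns
  summarise(across(starts_with("out"),
                   ~ list(tidy(glm(y ~ pred, family = binomial,
                                   data = data.frame(y = .x, pred = pred)))))) %>%
  # one row per outcome column, then expand each coefficient table
  pivot_longer(everything(), names_to = "outcome", values_to = "fit") %>%
  unnest(fit)

res
```

If this works as intended, res should have one row per model term, with an outcome column identifying which response the row belongs to. Swapping tidy() for glance() would give one row of model-level summaries per outcome instead.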