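Consider this example:
library(dplyr)   # tibble(), group_by(), %>%
library(tidyr)   # nest()
mydata <- tibble(ind_1 = c(NA, NA, 3, 4),   # tibble() replaces the deprecated data_frame()
                 ind_2 = c(2, 3, 4, 5),
                 ind_3 = c(5, 6, NA, NA),
                 y     = c(28, 34, 25, 12),
                 group = c('a', 'a', 'b', 'b'))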
> mydata
# A tibble: 4 x 5
  ind_1 ind_2 ind_3     y group
  <dbl> <dbl> <dbl> <dbl> <chr>
1    NA     2     5    28 a
2    NA     3     6    34 a
3     3     4    NA    25 b
4     4     5    NA    12 b
Here I want, for each group, to regress y on whatever variables are not missing in that group, and store the corresponding lm object in a list-column.
That is:
- for group a, these variables correspond to ind_2 and ind_3
- for group b, they correspond to ind_1 and ind_2
I tried the following, but it does not work:
mydata %>% group_by(group) %>% nest() %>%
  do(filtered_df <- . %>% select(which(colMeans(is.na(.)) == 0)),
     myreg = lm(y~ names(filtered_df)))
Any ideas? Thanks!
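For reference, here is a minimal sketch of one way this could be done. It is an assumption about what a working version might look like, not a definitive solution. It uses purrr::map over the nested data instead of do(), and the helper name fit_complete is hypothetical. One likely reason the attempt above fails is that lm(y ~ names(filtered_df)) treats the character vector of names as the predictor itself, so the sketch builds the formula from the names with reformulate() instead.

```r
library(dplyr)
library(tidyr)
library(purrr)

mydata <- tibble(ind_1 = c(NA, NA, 3, 4),
                 ind_2 = c(2, 3, 4, 5),
                 ind_3 = c(5, 6, NA, NA),
                 y     = c(28, 34, 25, 12),
                 group = c('a', 'a', 'b', 'b'))

# Hypothetical helper: regress y on every column of one group's data
# that contains no missing values.
fit_complete <- function(df) {
  # Keep only the columns without any NA in this group
  complete <- df[, colSums(is.na(df)) == 0, drop = FALSE]
  # Every remaining column except the response is a predictor
  predictors <- setdiff(names(complete), "y")
  # Build y ~ pred_1 + pred_2 + ... from the column names
  lm(reformulate(predictors, response = "y"), data = complete)
}

result <- mydata %>%
  group_by(group) %>%
  nest() %>%                                  # one row per group, data in a list-column
  ungroup() %>%
  mutate(myreg = map(data, fit_complete))     # lm objects stored in a list-column
```

With this, result$myreg[[1]] is the fit for group a (formula y ~ ind_2 + ind_3) and result$myreg[[2]] is the fit for group b (formula y ~ ind_1 + ind_2). Note that with only two rows per group in this toy data, the models are rank-deficient, so some coefficients come back as NA.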