I was trying to run regression models on multiple subgroups of a data frame using purrr::map_dfr(), but I get this somewhat weird error.

library(dplyr)
library(purrr)

# Create some data
test_df = map_dfr(seq_len(5), ~mtcars, .id = 'group')

# Run regression on subgroups
map_dfr(seq_len(5),
                ~ function(.x){
                  glm(am ~ mpg + cyl + disp + hp + drat + wt + qsec + vs + gear + carb, 
                            family = binomial, 
                            data = test_df[group == .x,]) %>% 
                    coefficients()
                },
                .id = 'group')

Error: Argument 1 must be a data frame or a named atomic vector.
Run `rlang::last_error()` to see where the error occurred.

Any suggestions would be appreciated.

Miao Cai

1 Answer

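If we are using function(x), there is no need for ~, and vice versa. The ~ is the compact lambda syntax in the tidyverse, so ~ function(.x){...} builds a lambda whose return value is a function rather than the model coefficients. Note also that the subset below uses test_df$group instead of a bare group, which is not visible inside the [ call.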

map_dfr(seq_len(5),
                ~ {
                  glm(am ~ mpg + cyl + disp + hp + drat + wt + qsec + vs + gear + carb, 
                            family = binomial, 
                            data = test_df[test_df$group == .x,]) %>% 
                    coefficients()
                },
                .id = 'group')

-output

# A tibble: 5 x 12
  group `(Intercept)`    mpg   cyl   disp    hp  drat    wt  qsec    vs  gear  carb
  <chr>         <dbl>  <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1             -11.6 -0.881  2.53 -0.416 0.344  23.2  7.44 -7.58 -47.0  42.9 -21.6
2 2             -11.6 -0.881  2.53 -0.416 0.344  23.2  7.44 -7.58 -47.0  42.9 -21.6
3 3             -11.6 -0.881  2.53 -0.416 0.344  23.2  7.44 -7.58 -47.0  42.9 -21.6
4 4             -11.6 -0.881  2.53 -0.416 0.344  23.2  7.44 -7.58 -47.0  42.9 -21.6
5 5             -11.6 -0.881  2.53 -0.416 0.344  23.2  7.44 -7.58 -47.0  42.9 -21.6

NOTE: the output is identical for every group because the example input repeats the same data (mtcars) in each group.
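
For comparison, here is a minimal sketch of the other option mentioned above: keeping a plain anonymous function and dropping the ~. The argument name `g` and the result name `coefs` are only illustrative, and the sketch assumes dplyr is installed, since map_dfr() relies on it to bind the rows. With this toy data glm() may emit convergence warnings, which do not stop the call.

```r
library(purrr)

# Same toy data as in the question: five stacked copies of mtcars
test_df <- map_dfr(seq_len(5), ~ mtcars, .id = "group")

# Plain anonymous function, with no ~ in front of it
coefs <- map_dfr(seq_len(5), function(g) {
  fit <- glm(am ~ mpg + cyl + disp + hp + drat + wt + qsec + vs + gear + carb,
             family = binomial,
             data = test_df[test_df$group == g, ])
  coefficients(fit)  # named numeric vector, which map_dfr() can bind as a row
}, .id = "group")

coefs
```

Either form should work; the point is to use one or the other, not both at once.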

akrun
  • Ok. Thank you! Knowing what the issue is, that error message does not make much sense to me. – Miao Cai Jul 27 '21 at 00:44
  • @MiaoCai that error message was kind of deceiving. You got that error because the value your code returns is a function. You can check this by using `map` instead of `map_dfr`; the `_dfr` suffix triggered the error because it expects a data.frame, etc. – akrun Jul 27 '21 at 00:46
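
The check suggested in the last comment can be sketched roughly as follows. The function body here is a placeholder rather than the original glm() call, and `res` is just an illustrative name; the point is only that the ~ function(.x){...} pattern yields functions.

```r
library(purrr)

# Same pattern as in the question: ~ wrapped around function(.x){...}
res <- map(seq_len(2), ~ function(.x) {
  .x + 1  # placeholder body; it never runs because the inner function is never called
})

# Each element is a function, not a data frame or a named vector
is.function(res[[1]])
```

This is why the row-binding step in map_dfr() complains: it receives functions rather than data frames or named atomic vectors.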