I often find myself choosing the wrong form for variable names when using purrr.

For example, take the code on the GitHub page of purrr:

library(purrr)

mtcars %>%
  split(.$cyl)

In split(.$cyl), I often make the mistake of writing split(cyl). This seems like the obvious choice, as it is consistent with other tidyverse calls such as select(cyl).

My question is: why is the .$ needed in front of the variable name?

adibender
Alex
  • 1
    `split()` is a base R function and does not follow the logic and principles built into `tidyverse` functions. There are no `purrr` function calls in your sample code – Kresten Apr 11 '19 at 09:19

1 Answer

The . represents the data object on the left-hand side of the pipe (here mtcars), and $ extracts the column from it. The column can also be extracted with [[:

mtcars %>%
    split(.[['cyl']])

Within mutate/summarise/group_by/select/arrange etc. we can simply pass bare column names, because those functions evaluate their arguments in the context of the dataset. Here it is different: split is a base R function, and it cannot find the column 'cyl' inside the dataset unless we extract the column ourselves.
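As a minimal sketch of the point above (using the built-in mtcars data and assuming no object named `cyl` exists in your workspace), the piped call can be compared with its plain base R equivalent, and the bare-name version can be seen to fail:

```r
library(magrittr)  # supplies %>% (purrr and dplyr re-export the same pipe)

# `.` is the left-hand side of the pipe, so these two calls do the same thing
by_cyl_piped <- mtcars %>% split(.$cyl)
by_cyl_plain <- split(mtcars, mtcars$cyl)
identical(by_cyl_piped, by_cyl_plain)

# A bare `cyl` is looked up as an ordinary variable, not as a column of mtcars,
# so this fails unless an object called `cyl` happens to exist in the workspace
bare_attempt <- tryCatch(
  mtcars %>% split(cyl),
  error = function(e) conditionMessage(e)
)
bare_attempt
```

The names `by_cyl_piped`, `by_cyl_plain` and `bare_attempt` are only illustrative; the error is captured with `tryCatch` so the script keeps running.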

One tidyverse option is to nest all the variables other than 'cyl', i.e.

mtcars %>%
    nest(-cyl) 

Now we have a list column named 'data', which holds all the other columns as a list of `data.frame`s, one per value of 'cyl'.
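To see what the nested result looks like, here is a small sketch. It assumes tidyr 1.0.0 or later, where the list column is named explicitly with `nest(data = -cyl)`; the variable name `nested` is just for illustration.

```r
library(magrittr)  # %>%
library(tidyr)     # nest()
library(purrr)     # map_int()

# One row per distinct value of cyl; the remaining columns go into `data`
nested <- mtcars %>% nest(data = -cyl)

nested$cyl                    # the grouping values
map_int(nested$data, nrow)    # number of rows in each nested data frame
names(nested$data[[1]])       # `cyl` itself is not part of the nested frames
```

Because each element of `data` is an ordinary data frame, it can be passed to `map()` in the same way as an element of the list returned by `split()`.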


With newer versions of dplyr (tested with 0.8.1), there is also group_split, as noted in the comments by @Moody_Mudskipper:

mtcars %>%
       group_split(cyl)
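A short sketch of how the `group_split()` result compares with `split()`, assuming dplyr 0.8.0 or later. The list it returns is unnamed, so one possible way to get names back (an illustration, not something `group_split()` does for you) is to set them from the sorted distinct values of `cyl`:

```r
library(dplyr)  # group_split() and %>%

pieces <- mtcars %>% group_split(cyl)
length(pieces)
names(pieces)   # the elements carry no names

# Groups come back ordered by the grouping value, so the sorted distinct
# values of cyl line up with the list elements
named_pieces <- setNames(pieces, as.character(sort(unique(mtcars$cyl))))
names(named_pieces)
```

Unlike `split(.$cyl)`, this takes a bare column name, which is the behaviour the question was expecting.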
akrun
  • I thought I understood the concept so I tried to take things a little step further and did the following: `mtcars %>% nest(-cyl) %>% mutate(model = map(data,cyl_model)) %>% map(.x = .$model , .f = summary)` and got the error "Error in if (correlation) { : argument is not interpretable as logical". It seemed to me that `.$model` makes sense as I am passing it to `summary` which is a base R function. – Alex Mar 06 '18 at 13:26
  • What is `cyl_model` ? – akrun Mar 06 '18 at 13:28
  • Sorry, `cyl_model <- function(df) { lm(mpg ~ wt, data = mtcars) }` – Alex Mar 06 '18 at 13:28
  • 2
    @Alex You meant `mtcars %>% nest(-cyl) %>% mutate(Summary = map(data, ~ lm(mpg ~ wt, data = .x) %>% summary)) %>% .$Summary` – akrun Mar 06 '18 at 13:30
  • 1
    @Alex `mtcars` is the whole dataset, so if you pass that as the `data` object, the model is fitted on the whole column instead of the split or nested subset – akrun Mar 06 '18 at 13:34
  • 1
    @Alex Your function `cyl_model` takes `df` as its argument but then uses `data = mtcars` inside the `lm` call. It should be `data = df`, i.e. `cyl_model <- function(df) { lm(mpg ~ wt, data = df) }` – akrun Mar 06 '18 at 13:36
  • 2
    @akrun to get a similar output here we can use `dplyr::group_split`, `mtcars %>% group_split(cyl)`. The elements won't be named though. – moodymudskipper Jun 21 '19 at 16:10