
I need to accomplish a wrangling task with tidyr/dplyr entirely within a %>% pipe, that is, without assigning data to helper objects. I am given the following tibble, trb:

library(tibble)

trb <-
  tribble(~name,     ~type,    ~dat,
        "john",    "cat",    mtcars,
        "john",    "spider", Puromycin,
        "amanda",  "dog",    ToothGrowth,
        "chris",   "wolf",   PlantGrowth,
        "annie",   "lion",   women,
        "richard", "frog",   trees,
        "liz",     "horse",  USArrests,
        "raul",    "snake",  iris,
        "kate",    "bear",   quakes)

and I want to do a 2-step wrangling (not necessarily in the following order):

  1. lump together john's dat data frames into a named list (in which names will come from type); and
  2. shift john's information to the leftmost column while nesting the data of the others.

The desired output should therefore be:

desired_output <-
  tribble(~dat_john,                                  ~other_people,
          list("cat" = mtcars, "spider" = Puromycin), trb %>% dplyr::filter(name != "john")
        )

As noted above, it's important to me to get from trb to desired_output using %>% only. Any ideas?

Jon Spring
Emman
  • So you want a list in the first column `dat_john` but a tibble in the other column? I'm unclear on what you're looking for to generate the result in a more direct way than your code already does. – Jon Spring Jan 11 '22 at 22:36
  • So to clarify/confirm, you want to get a different data type for the two columns: a list of named tibbles on the left, and a nested tibble on the right? – Jon Spring Jan 12 '22 at 08:20
  • @JonSpring, yes, correct. exactly like `desired_output`. – Emman Jan 12 '22 at 08:23

2 Answers


Maybe something like this? It first categorizes the data as john or not, then nests all the data for each category into one list, then pivots those two categories wide.

library(tidyr); library(dplyr)
trb %>%
  mutate(column = if_else(name == "john", "dat_john", "other_people")) %>%
  nest(data = -column) %>%
  pivot_wider(names_from = column, values_from = data) %>%
  # from @ekoam's answer, to convert this column to named list
  mutate(dat_john = with(dat_john[[1L]], list(setNames(dat, type))))
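
The last step can be looked at in isolation. A minimal sketch, assuming a hypothetical two-row tibble `johns` that stands in for the nested `dat_john[[1L]]`: `with()` evaluates the expression using the tibble's columns, and `setNames()` attaches the `type` values as names to the list of data frames.

```r
library(tibble)

# `johns` is a hypothetical stand-in for the tibble stored in dat_john[[1L]]
johns <- tribble(
  ~name,  ~type,    ~dat,
  "john", "cat",    mtcars,
  "john", "spider", Puromycin
)

# Name each data frame in the `dat` list-column after its `type`
named <- with(johns, setNames(dat, type))

names(named)
# [1] "cat"    "spider"
```

In the pipeline the result is additionally wrapped in `list()` so that the whole named list fits into a single cell of the one-row tibble.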
Jon Spring

It is possible to achieve what you want in a single pipeline, but I am not sure why you want to do this. Note that you need to manually make "dat_john" the first factor level and sort the data frame by it. Otherwise, if "john" is not the first entry, his column won't end up leftmost after pivot_wider.

library(dplyr)
library(tidyr)

trb %>% 
  group_by(id = factor(name != "john", labels = c("dat_john", "other_people"))) %>% 
  arrange(id) %>% # use factor and arrange to ensure that john is always the first level
  nest(data = -id) %>% 
  pivot_wider(names_from = id, values_from = data) %>% 
  mutate(dat_john = with(dat_john[[1L]], list(setNames(dat, type))))

Output

# A tibble: 1 x 2
  dat_john         other_people    
  <list>           <list>          
1 <named list [2]> <tibble [7 x 3]>
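
To see the ordering point in action, here is a minimal sketch, assuming a smaller, reshuffled copy of the data (the name `trb_shuffled` is made up for this example) in which john is not the first row. It uses mutate() rather than group_by() to create `id`, which is a substitution for illustration and not the pipeline above; the result is assigned to `res` only so that it can be inspected.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical reshuffled data: john is deliberately not in the first row
trb_shuffled <- tribble(
  ~name,    ~type,    ~dat,
  "amanda", "dog",    ToothGrowth,
  "john",   "cat",    mtcars,
  "chris",  "wolf",   PlantGrowth,
  "john",   "spider", Puromycin
)

res <- trb_shuffled %>%
  # FALSE (john) sorts before TRUE, so "dat_john" becomes the first level
  mutate(id = factor(name != "john", labels = c("dat_john", "other_people"))) %>%
  arrange(id) %>%
  nest(data = -id) %>%
  pivot_wider(names_from = id, values_from = data) %>%
  mutate(dat_john = with(dat_john[[1L]], list(setNames(dat, type))))

names(res)
# [1] "dat_john"     "other_people"
```

Without the arrange(id) step, the columns would follow the order in which the two groups first appear in the data, so other_people would come first here.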
ekoam