
I have data that looks as follows:

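library(dplyr)

# One row per (group, letter) pair; the suffix after "_" is the group number
toy.dat <- data.frame(group = c(rep("A_0", 3), rep("A_1", 2),
                                rep("B_0", 3), rep("B_1", 3)))
toy.dat$letters <- c("A", "B", "C", "A", "D", "C", "E", "F", "A", "B", "F")

# Collapse each group to a list of its letters and extract the group number
toy.dat %>%
  group_by(group) %>%
  summarise(letters = list(letters), num = n()) %>%
  mutate(group_number = gsub(".*_", "", group))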


group   letters            num   group_number
A_0     c("A", "B", "C")   3     0
A_1     c("A", "D")        2     1
B_0     c("C", "E", "F")   3     0
B_1     c("A", "B", "F")   3     1

I would like to group by group_number, find the intersection of the letters across the rows in each group, and add the result to the data frame as a new column.

The output should give "C" for A_0 and B_0, and "A" for A_1 and B_1.

Ronak Shah
say.ff

1 Answer

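We may use reduce from purrr to fold intersect over the list elements within each group_number. The result is wrapped in list() so that it is stored as a single list-column entry regardless of its length: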

library(dplyr)
library(purrr)
toy.dat %>%
  group_by(group) %>%
  summarise(letters = list(letters), num = n()) %>%
  mutate(group_number = gsub(".*_", "", group)) %>%
  group_by(group_number) %>%
  mutate(intersect = list(reduce(letters, intersect))) %>%
  ungroup() %>%
  mutate(nintersect = lengths(intersect))

-output

# A tibble: 4 × 6
  group letters     num group_number intersect nintersect
  <chr> <list>    <int> <chr>        <list>         <int>
1 A_0   <chr [3]>     3 0            <chr [1]>          1
2 A_1   <chr [2]>     2 1            <chr [1]>          1
3 B_0   <chr [3]>     3 0            <chr [1]>          1
4 B_1   <chr [3]>     3 1            <chr [1]>          1
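
The tibble print hides the contents of the intersect list column. As a quick cross-check, and not as part of the pipeline above, the following base R sketch rebuilds the toy data as a named list and folds intersect() over each group number with Reduce(). The names letters_by_group and common are introduced here for illustration only.

```r
# Toy data from the question, rebuilt as a named list of letter vectors.
letters_by_group <- list(
  A_0 = c("A", "B", "C"),
  A_1 = c("A", "D"),
  B_0 = c("C", "E", "F"),
  B_1 = c("A", "B", "F")
)

# Suffix after the underscore ("0" or "1"), same pattern as in the question.
group_number <- gsub(".*_", "", names(letters_by_group))

# Split the vectors by suffix, then fold intersect() over each subset.
common <- lapply(
  split(letters_by_group, group_number),
  function(vectors) Reduce(intersect, vectors)
)

common$`0`       # "C"
common$`1`       # "A"
lengths(common)  # one shared letter per group number
```

This matches the result the question asks for: "C" for A_0 and B_0, and "A" for A_1 and B_1.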
akrun
  • Thanks, it seems to work for my toy data, but with my real data set this error pops up: Error: Problem with `mutate()` column `intersect`. ℹ `intersect = reduce(genes, intersect)`. ℹ `intersect` must be size 5 or 1, not 27. ℹ The error occurred in group 1: cluster_num = "0". – say.ff Oct 18 '21 at 20:20
  • @say.ff You probably have either no element or more than one element in the intersection. Just wrap it in a list and that should solve it – akrun Oct 18 '21 at 20:20
  • see the updated post – akrun Oct 18 '21 at 20:21
  • Yes, it is working. I also need to add a column with the number of elements in each intersection. How can I add that? – say.ff Oct 18 '21 at 20:24
  • Oh sorry, I just forgot to do so. I upvoted the answer, and thank you so much for it – say.ff Oct 18 '21 at 20:26
  • Do you have any idea how I can add the intersection number? – say.ff Oct 18 '21 at 20:27
  • @say.ff I already updated that. Can you refresh the page – akrun Oct 18 '21 at 20:29
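
The error quoted in the comments is consistent with an intersection of 27 elements being assigned directly to a group of 5 rows, which mutate() cannot recycle. The base R sketch below illustrates why wrapping the result in a list helps and how lengths() gives the count. The gene names and the variables genes, shared, and df are hypothetical and are not taken from the asker's real data.

```r
# Hypothetical gene sets for the rows of one cluster (illustrative only).
genes <- list(
  c("g1", "g2", "g3", "g4"),
  c("g2", "g3", "g4", "g5"),
  c("g3", "g4", "g6")
)

# The intersection can have any length, here 2, which matches
# neither the number of rows (3) nor 1.
shared <- Reduce(intersect, genes)
length(shared)

# It can also be empty when the sets have nothing in common.
none_shared <- Reduce(intersect, list("a", "b"))
length(none_shared)

# Wrapping the vector in a list makes it a single value per row,
# so it fits into a column whatever its length.
df <- data.frame(id = seq_along(genes))
df$intersect <- rep(list(shared), nrow(df))

# lengths() counts the elements inside each list entry.
df$nintersect <- lengths(df$intersect)
df$nintersect
```

The same idea underlies list(reduce(letters, intersect)) followed by lengths(intersect) in the answer.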