R/dplyr: add column for group membership

Asked Jun 04 '15 at 14:37

I'm using dplyr (and may need tidyr?) and have a list of "samples" along with the "group" each one belongs to. Some samples belong to multiple groups. I'd like to group_by and summarize this data so that I get one row per sample, with a delimited character string listing the groups that sample belongs to. I realize this isn't "tidy" data, but it's for the final report stage and needs no further processing.

Here's what I have:

data_frame(sample=c("A", "B", "C", "A", "B", "D", "E"),
            group=c("g1", "g1", "g2", "g2", "g3", "g3", "g3"))
Source: local data frame [7 x 2]

  sample group
1      A    g1
2      B    g1
3      C    g2
4      A    g2
5      B    g3
6      D    g3
7      E    g3

Here's what I want:

data_frame(sample=c("A", "B", "C", "D", "E"),
            groups=c("g1; g2", "g1; g3", "g2", "g3", "g3"))
Source: local data frame [5 x 2]

  sample groups
1      A g1; g2
2      B g1; g3
3      C     g2
4      D     g3
5      E     g3

The delimiter doesn't have to be a semicolon, but a semicolon doesn't cause CSV issues. The solution should let me choose the delimiter.

edited Jun 04 '15 at 14:39 by Frank

asked Jun 04 '15 at 14:37 by Stephen Turner

Try `df1 %>% group_by(sample) %>% summarise(groups= paste(group, collapse="; "))` – akrun Jun 04 '15 at 14:41
This one has a dplyr answer, not sure if you regard it as a duplicate: http://stackoverflow.com/q/28998990/1191259 – Frank Jun 04 '15 at 14:42
Thanks. Search strategy is half the battle. – Stephen Turner Jun 04 '15 at 14:52
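
For reference, the one-liner suggested in the comments can be sketched as a small function with a selectable delimiter. This is only a minimal sketch: `df1` is rebuilt here from the example data in the question (using base `data.frame()`), and `collapse_groups()` and its `sep` argument are hypothetical names introduced for illustration, not part of dplyr.

```r
library(dplyr)

# Example data from the question, rebuilt with base data.frame().
df1 <- data.frame(
  sample = c("A", "B", "C", "A", "B", "D", "E"),
  group  = c("g1", "g1", "g2", "g2", "g3", "g3", "g3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one row per sample, with that sample's
# group labels pasted into a single string separated by `sep`.
collapse_groups <- function(df, sep = "; ") {
  df %>%
    group_by(sample) %>%
    summarise(groups = paste(group, collapse = sep))
}

result <- collapse_groups(df1)              # default "; " delimiter
result_pipe <- collapse_groups(df1, "|")    # any other delimiter works the same way
print(result)
```

If a sample could be listed under the same group more than once, wrapping the column as `paste(unique(group), collapse = sep)` should avoid repeated labels.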
