How can I group by name and then only select the maximum value of each group in R?

Question

I have a data.frame in which the same names appear in several observations, each with a different value. Example:

Name Value Other_attributes
A    20    BLABLA1
B    40    BLABLA2
A    35    BLABLA3
B    10    BLABLA4
C    80    BLABLA5

I want a new data.frame with only the observations with the maximum values of each group.

Name Value Other_attributes
B    40    BLABLA2
A    35    BLABLA3
C    80    BLABLA5

I hope this is clear enough. Thank you very much.

score 4 · Accepted Answer · answered Apr 20 '15 at 09:44

Try this with dplyr, assuming your data frame is named `df`:

library(dplyr)

df %>% group_by(Name) %>% filter(Value == max(Value)) %>% ungroup

Data

df <- data.frame(Name = c("A", "B", "A", "B", "C"),
                 Value = c(20, 40, 35, 10, 80),
                 Other_attributes = c("BLABLA1", "BLABLA2", "BLABLA3", "BLABLA4", "BLABLA5"))
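
For reference, here is a minimal self-contained sketch that puts the sample data and the pipeline together. It also shows a `slice(which.max(Value))` variant for comparison. The variable names `max_rows` and `one_per_group` are only illustrative. One difference worth knowing: `filter(Value == max(Value))` keeps every row tied for the group maximum, while `which.max` returns only the first one.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  Name = c("A", "B", "A", "B", "C"),
  Value = c(20, 40, 35, 10, 80),
  Other_attributes = c("BLABLA1", "BLABLA2", "BLABLA3", "BLABLA4", "BLABLA5")
)

# Keep every row whose Value equals its group's maximum (ties are all kept)
max_rows <- df %>%
  group_by(Name) %>%
  filter(Value == max(Value)) %>%
  ungroup()

# Keep exactly one row per group: the first row holding the maximum
one_per_group <- df %>%
  group_by(Name) %>%
  slice(which.max(Value)) %>%
  ungroup()

print(max_rows)
print(one_per_group)
```

With the sample data there are no ties, so both variants should return the same three rows, possibly in a different order.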

answered Apr 20 '15 at 09:44

dimitris_ps


You don't need `ungroup` here. Also, you could do `df %>% group_by(Name) %>% slice(which.max(Value))` – David Arenburg Apr 20 '15 at 09:51
I always use `ungroup` at the end after a `group_by`, in case some other manipulation is done on the data frame afterwards. Thanks for the `slice(which.max(Value))` – dimitris_ps Apr 20 '15 at 09:53
This works perfectly with the corrections of @DavidArenburg, thanks to both of you. And sorry for the repeated question, I checked before but I was not able to find the answer. – Laura Apr 20 '15 at 11:21

score 2 · Answer 2 · edited Apr 20 '15 at 09:46

Use data.table.

library(data.table)
setDT(df)[, max(Value), by = Name]

edited Apr 20 '15 at 09:46

David Arenburg


answered Apr 20 '15 at 09:43

pogibas


If you are already overwriting `df`, just use `setDT`. I've already edited your previous answers and you keep doing it for some reason. Also, this won't match the exact output. You need to use `which.max` with `.SD` here. – David Arenburg Apr 20 '15 at 09:47
It is good, but I lose the other attributes... – Laura Apr 20 '15 at 09:51
@Laura `setDT(df)[, .SD[which.max(Value)], by = Name]` – David Arenburg Apr 20 '15 at 09:52
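
As a runnable sketch of the `.SD[which.max(Value)]` suggestion, using the sample data from the question: here the table is built directly with `data.table()` instead of converting `df` by reference with `setDT`, and the names `dt` and `result` are only illustrative.

```r
library(data.table)

# Sample data from the question, created directly as a data.table
dt <- data.table(
  Name = c("A", "B", "A", "B", "C"),
  Value = c(20, 40, 35, 10, 80),
  Other_attributes = c("BLABLA1", "BLABLA2", "BLABLA3", "BLABLA4", "BLABLA5")
)

# .SD holds the rows of the current group; which.max(Value) picks the
# position of the first maximum, so the whole row is kept, not just Value
result <- dt[, .SD[which.max(Value)], by = Name]

print(result)
```

Unlike `max(Value)` on its own, this keeps `Other_attributes` alongside the maximum value for each name.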
