Top_n return both max and min value - R

Question

Is it possible for top_n() to return both the max and the min value at the same time?

Using the example from the reference page https://dplyr.tidyverse.org/reference/top_n.html

I tried the following

library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))
df %>% top_n(c(1, -1)) ## returns an error

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1)) 
df %>% top_n(1) %>%  top_n(-1) ## returns only max value

Thanks

why can't you use `summarise(max_x = max(x), min_x = min(x))`? — akash87, Jan 23 '20 at 21:46

tmfmnk · Accepted Answer · 2022-11-25T12:43:58.363

8

Not really involving top_n(), but you can try:

df %>%
 arrange(x) %>%
 slice(c(1, n()))

   x
1  1
2 10

Or:

df %>%
 slice(which(x == max(x) | x == min(x))) %>%
 distinct()

Or (provided by @Gregor):

df %>%
 slice(c(which.min(x), which.max(x)))

Or using filter():

df %>%
 filter(x %in% range(x) & !duplicated(x))
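
As a quick sanity check (a minimal sketch, not part of the original answer), the four alternatives above can be run side by side on the example data. The names `by_arrange`, `by_which`, `by_index`, `by_filter` and `all_match` are made up for illustration. Row order differs between the approaches, so the sketch compares the sorted values rather than the data frames themselves.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# Each pipeline should keep one row for the minimum and one for the maximum.
by_arrange <- df %>% arrange(x) %>% slice(c(1, n()))
by_which   <- df %>% slice(which(x == max(x) | x == min(x))) %>% distinct()
by_index   <- df %>% slice(c(which.min(x), which.max(x)))
by_filter  <- df %>% filter(x %in% range(x) & !duplicated(x))

# Row order is not the same across approaches, so compare sorted values.
results <- list(by_arrange, by_which, by_index, by_filter)
all_match <- all(sapply(results, function(d) identical(sort(d$x), c(1, 10))))
print(all_match)
```

With more than one column, the approaches may differ in which of the tied rows they keep, so this check only covers the single-column example.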

edited Nov 25 '22 at 12:43

answered Jan 23 '20 at 21:44


Similar, `df %>% slice(which.max(x), which.min(x))` means you don't need the `distinct`. (Though it doesn't give you the option of omitting the `distinct`) – Gregor Thomas Jan 23 '20 at 22:16
@Gregor - reinstate Monica that's really neat, thanks :) – tmfmnk Jan 23 '20 at 22:18

IceCreamToucan · Answer 2 · 2020-01-23T22:34:36.330

3

Idea similar to @Jakub's answer with purrr::map_dfr

library(tidyverse) # loads dplyr and purrr (for map_dfr)

df %>% 
  map_dfr(c(1, -1), top_n, wt = x, x = .)
#    x
# 1 10
# 2  1
# 3  1
# 4  1
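
The three rows with 1 appear because top_n() keeps ties. If a single row per extreme is wanted, one possible sketch (not part of the original answer, and assuming dplyr 1.0.0 or later, where slice_max() and slice_min() supersede top_n()) is to turn ties off explicitly. The name `extremes` is only for illustration.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# with_ties = FALSE keeps exactly n rows even when the extreme value is repeated.
extremes <- bind_rows(
  slice_max(df, x, n = 1, with_ties = FALSE),
  slice_min(df, x, n = 1, with_ties = FALSE)
)
print(extremes)
```

Leaving with_ties at its default of TRUE should reproduce the tie-keeping behaviour shown in the output above.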

edited Jan 23 '20 at 22:34

answered Jan 23 '20 at 22:14


akrun · Answer 3 · 2020-01-23T21:50:06.413

2

Here is an option with top_n: pass a logical vector that is TRUE for the min/max values (computed with range), then keep only the distinct rows, since the extremes can be tied (here the minimum appears more than once).

library(dplyr)
df %>% 
   top_n(x %in% range(x), 1) %>%
   distinct
#   x
#1 10
#2  1

edited Jan 23 '20 at 21:50

answered Jan 23 '20 at 21:44


Jakub.Novotny · Answer 4 · 2020-01-23T21:57:26.037

2

I like @tmfmnk's answer. If you want to use the top_n function, you can do this:

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

bind_rows(
  df %>% top_n(1),
  df %>% top_n(-1)
)

# this solution addresses the specification in the comments,
# where df also has a grouping column y
df %>%
  group_by(y) %>%
  summarise(min = min(x),
            max = max(x),
            average = mean(x))
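
To make the grouped version runnable on its own, here is a minimal sketch using the x values from the comments. The group labels are fixed with rep() instead of the random sample() call, which is an assumption made only so the result is reproducible. The name `by_group` is illustrative.

```r
library(dplyr)

# Fixed group labels (an assumption, replacing the random sample() from the comments)
df <- data.frame(
  x = c(10, 4, 1, 6, 3, 1, 1, 34, 12, 18, 19, 8),
  y = rep(c("Group1", "Group2", "Group3"), each = 4)
)

# One row per group, with the label in y and the three statistics as columns.
by_group <- df %>%
  group_by(y) %>%
  summarise(min = min(x),
            max = max(x),
            average = mean(x))
print(by_group)
```

Each group label stays attached to its min, max and average, which is what the comment asks for.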

edited Jan 23 '20 at 21:57

answered Jan 23 '20 at 21:48


Thanks for all the answers! In fact, my data set was a bit more complicated: `df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1, 34, 12, 18, 19, 8), y = sample(c("Group1", "Group2", "Group3"), 12, replace = T))`, summarised with `df %>% group_by(y) %>% summarise(average = mean(x))`. I wanted to obtain both the max and the average, along with a proper label for the group each belongs to, and it works fine with your solution – Yaahtzeck Jan 23 '20 at 21:50
It can also be changed slightly with `pipeR`, i.e. `df %>>% (~ mx = top_n(., 1)) %>% top_n(-1) %>% bind_rows(mx, .)` – akrun Jan 23 '20 at 21:58
I tried to provide you with an alternative solution that would allow you to have min, max and average and also the label. You can do a row transformation with gather if you want. – Jakub.Novotny Jan 23 '20 at 21:58
