
Is it possible for the top_n() command to return both the max and the min value at the same time?

Using the example from the reference page https://dplyr.tidyverse.org/reference/top_n.html, I tried the following:

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1)) 
df %>% top_n(c(1,-1)) ## returns an error

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1)) 
df %>% top_n(1) %>%  top_n(-1) ## returns only max value

Thanks

Yaahtzeck

4 Answers


Not really involving top_n(), but you can try:

df %>%
 arrange(x) %>%
 slice(c(1, n()))

   x
1  1
2 10

Or:

df %>%
 slice(which(x == max(x) | x == min(x))) %>%
 distinct()

Or (provided by @Gregor):

df %>%
 slice(c(which.min(x), which.max(x)))

Or using filter():

df %>%
 filter(x %in% range(x) & !duplicated(x))
tmfmnk
  • Similar, `df %>% slice(which.max(x), which.min(x))` means you don't need the `distinct`. (Though it doesn't give you the option of omitting the `distinct`) – Gregor Thomas Jan 23 '20 at 22:16
  • @Gregor - reinstate Monica that's really neat, thanks :) – tmfmnk Jan 23 '20 at 22:18

Idea similar to @Jakub's answer with purrr::map_dfr

library(tidyverse) # loads dplyr, plus purrr for map_dfr

df %>% 
  map_dfr(c(1, -1), top_n, wt = x, x = .)
#    x
# 1 10
# 2  1
# 3  1
# 4  1
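
The same idea can be spelled out without purrr, as a minimal sketch: call top_n() once per value of n and row-bind the pieces. The names `k`, `pieces` and `result` are illustrative only and are not part of the original answer.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# One top_n() call per n: 1 keeps the largest x, -1 keeps the smallest x
# (ties are kept, so all three rows with x == 1 come back)
pieces <- lapply(c(1, -1), function(k) top_n(df, k, wt = x))

# Stack the two results into a single data frame
result <- bind_rows(pieces)
print(result)
```

Because top_n() keeps ties, the result has the same four rows as the map_dfr version; add distinct() if only one row per extreme is wanted.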
IceCreamToucan
  • 28,083
  • 2
  • 22
  • 38

Here is an option with top_n where we pass a logical vector that is TRUE for the min/max values (found with range), and then take the distinct rows because there are ties, i.e. duplicate elements are present:

library(dplyr)
df %>% 
   top_n(x %in% range(x), 1) %>%
   distinct
#   x
#1 10
#2  1
akrun

I like @tmfmnk's answer. If you want to use the top_n function, you can do this:

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

bind_rows(
  df %>% top_n(1),
  df %>% top_n(-1)
)
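
Since dplyr 1.0.0, top_n() is superseded by slice_max() and slice_min(), so the same bind_rows pattern can be sketched with those functions. This assumes dplyr 1.0.0 or later is installed; the names `both` and `one_each` are illustrative only.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# Default with_ties = TRUE keeps every row tied for the extreme value
both <- bind_rows(
  df %>% slice_max(x, n = 1),
  df %>% slice_min(x, n = 1)
)
print(both)

# with_ties = FALSE returns exactly one row per extreme
one_each <- bind_rows(
  df %>% slice_max(x, n = 1, with_ties = FALSE),
  df %>% slice_min(x, n = 1, with_ties = FALSE)
)
print(one_each)
```

With the default tie handling the minimum appears three times, matching what top_n(-1) returns; with_ties = FALSE avoids the need for a later distinct().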

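# this solution addresses the specification in the comments;
# it needs the grouping column y, so df is rebuilt as described there
df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1, 34, 12, 18, 19, 8),
                 y = sample(c("Group1", "Group2", "Group3"), 12, replace = TRUE))

df %>%
  group_by(y) %>%
  summarise(min = min(x),
            max = max(x),
            average = mean(x))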
Jakub.Novotny
  • Thanks for all the answers! In fact, my data set was a bit more complicated, such that `df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1, 34, 12, 18, 19, 8), y = sample(c("Group1", "Group2", "Group3"), 12, replace = T))` and `df %>% group_by(y) %>% summarise(average = mean(x))`. I wanted to obtain both the max and the mean, along with a proper label for the group each belongs to, and it works fine with your solution – Yaahtzeck Jan 23 '20 at 21:50
  • It can be also changed slightly with `pipeR` i.e. `df %>>% (~ mx = top_n(., 1)) %>% top_n(-1) %>% bind_rows(mx, .)` – akrun Jan 23 '20 at 21:58
  • I tried to provide you with an alternative solution that would allow you to have min, max and average and also the label. You can do a row transformation with gather if you want. – Jakub.Novotny Jan 23 '20 at 21:58