I would like to get the n highest and n lowest rows based on a variable in a dataframe. For example, for the 5 highest and 5 lowest rows of mpg in the mtcars dataframe, the output would be
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4
#> Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4
#> Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4
#> Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4
#> Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4
#> Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
#> Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
#> Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2
#> Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
#> Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
I created this output with the following code, however this relies in selecting the right cutoff values for mpg, but I want it to select automatically the highest & lowest x rows.
library(tidyverse)
mtcars %>% arrange(mpg) %>% filter(!between(mpg, 15,26))
This is somewhat similar to this question: Extract rows with highest and lowest values from a data frame
However the answers there are kind of hard to understand, and I was wondering whether there's a better way that does not involve using slice_min()
and slice_max()
to create two separate dataframes and then merging them.
Edited to add: if there was a way of using mutate to create a variable which indicates the top 5 lowest and highest numbers like this, that would be great:
#> mpg cyl disp hp drat wt qsec vs am gear carb rank
#> 1 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 low
#> 2 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 low
#> 3 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 low
#> 4 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 low
#> 5 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 low
#> 6 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 high
#> 7 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 high
#> 8 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 high
#> 9 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 high
#> 10 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 high