
I have data like this:

df:

Group   Point
A       6000
B       5000
C       1000
D        100
F        70

Before I graph this df, I would like to remove the values above the 95th percentile of my data frame. Can anybody tell me how to do that?

Stephane Rouberol
user1471980

2 Answers


Use the `quantile` function to find the cutoff, then keep only the rows below it (the data frame is named `d` here):

> quantile(d$Point, 0.95)
 95% 
5800 

> d[d$Point < quantile(d$Point, 0.95), ]
  Group Point
2     B  5000
3     C  1000
4     D   100
5     F    70
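
For anyone who wants to reproduce this from scratch, here is a minimal, self-contained sketch in base R. It rebuilds the example data from the question; the names `cutoff` and `trimmed` are only illustrative and do not appear in the answer above.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Group = c("A", "B", "C", "D", "F"),
  Point = c(6000, 5000, 1000, 100, 70)
)

# 95th percentile of Point, using quantile()'s default interpolation.
cutoff <- quantile(df$Point, 0.95)

# Keep rows strictly below the cutoff, as in the answer above.
# Switch to <= if values exactly equal to the cutoff should be kept.
trimmed <- df[df$Point < cutoff, ]

print(trimmed)
```

Storing the cutoff in its own variable makes it easy to reuse the same threshold later, for example when labelling the plot.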
GSee

Or use the 'dplyr' library:

> library(dplyr)
> quantile(df$Point, 0.95)
 95% 
5800

> df %>% filter(Point < quantile(df$Point, 0.95))
  Group Point
1     B  5000
2     C  1000
3     D   100
4     F    70
swojtasiak
    You can even go a little shorter without specifying `df`: `df %>% filter(Point < quantile(Point, 0.95))`. – kluu Apr 20 '18 at 09:44
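
To compare the two spellings side by side, here is a small sketch that assumes the dplyr package is installed. The names `explicit` and `shorter` are only illustrative.

```r
library(dplyr)

# Rebuild the example data from the question.
df <- data.frame(
  Group = c("A", "B", "C", "D", "F"),
  Point = c(6000, 5000, 1000, 100, 70)
)

# Form from the answer: the quantile is computed on df$Point explicitly.
explicit <- df %>% filter(Point < quantile(df$Point, 0.95))

# Shorter form from the comment: inside filter(), a bare column name
# refers to that column of the data frame being piped in.
shorter <- df %>% filter(Point < quantile(Point, 0.95))

print(shorter)
```

On an ungrouped data frame like this one, both forms select the same rows. They can differ after `group_by()`: the bare `Point` is then evaluated within each group, while `df$Point` always refers to the whole column.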