Slice the rows with the earliest datetime

Question

I have a dataframe and I want to filter only rows that corresponds to the earliest datetime. The following is my dataframe:

library(lubridate)
df<-data.frame(ID=rep(1:2,5:6),DATETIME_OF_PROCEDURE=rep(c(ymd_hms("2013-03-16 03:00:42"),
                                                           ymd_hms("2013-02-12 08:00:42"),
                                                           ymd_hms("2014-06-19 01:00:42"),
                                                           ymd_hms("2014-01-22 01:00:42"),
                                                           ymd_hms("2014-06-12 02:00:40")),
                                                         c(3,2,2,2,2)))

I would like to get:

#     > df
#    ID DATETIME_OF_PROCEDURE
# 1   1   2013-02-12 16:00:42
# 2   1   2013-02-12 16:00:42
# 3   2   2014-01-22 09:00:42
# 4   2   2014-01-22 09:00:42

I tried the following but it gives an empty dataframe.

df %>% 
  arrange(DATETIME_OF_PROCEDURE) %>% 
  group_by(ID) %>%
  slice(min(DATETIME_OF_PROCEDURE))

I also tried slice(first(DATETIME_OF_PROCEDURE)) but again, it wouldn't work.

The correct syntax would be `df %>% arrange(DATETIME_OF_PROCEDURE) %>% group_by(ID) %>% slice(1L)` but `slice` will not capture ties. — Sotos, May 08 '17 at 09:29

Iaroslav Domin · Accepted Answer · 2017-05-08T09:22:01.760

2

df %>% 
    group_by(ID) %>% 
    filter(DATETIME_OF_PROCEDURE == min(DATETIME_OF_PROCEDURE))

edited May 08 '17 at 09:22

answered May 08 '17 at 09:16

Iaroslav Domin

2,698
10
19

score 2 · Answer 2 · answered May 08 '17 at 09:30

Since you already arrange, slice(1) works fine:

df %>% 
  arrange(DATETIME_OF_PROCEDURE) %>% 
  group_by(ID) %>%
  slice(1)

Other use which.min, since filter takes indices:

df %>% 
  group_by(ID) %>%
  slice(which.min(DATETIME_OF_PROCEDURE))

Be careful of ties in your data. Use filter if you want to retain all ties.

Both return the same:

Source: local data frame [2 x 2]
Groups: ID [2]

     ID DATETIME_OF_PROCEDURE
  <int>                <dttm>
1     1   2013-02-12 09:00:42
2     2   2014-01-22 02:00:42

Slice the rows with the earliest datetime

2 Answers2