0

I have a dataframe with two date columns. All the dates in the Date2 column are the same.

Date1<-c("16/11/2007 17:00:00", "17/11/2007 01:02:00", "17/11/2007 05:00:00", "17/11/2007 09:00:00", "17/11/2007 13:00:00", "17/11/2007 17:00:00", "18/11/2007 01:00:00", "18/11/2007 05:00:00")
Date2<-rep("17/11/2007 17:00:00",times=8)
Data<-data.frame(Date1,Date2)
Data

  Date1               Date2
 16/11/2007 17:00:00 17/11/2007 17:00:00
 17/11/2007 01:02:00 17/11/2007 17:00:00
 17/11/2007 05:00:00 17/11/2007 17:00:00
 17/11/2007 09:00:00 17/11/2007 17:00:00
 17/11/2007 13:00:00 17/11/2007 17:00:00
 17/11/2007 17:00:00 17/11/2007 17:00:00
 18/11/2007 01:00:00 17/11/2007 17:00:00
 18/11/2007 05:00:00 17/11/2007 17:00:00

I want to subset my data such that all dates in Date1 are equal to or LATER than the dates in Date2. Here's what the desired output would look like:

Data[4:8,]

  Date1               Date2
 17/11/2007 17:00:00 17/11/2007 17:00:00
 18/11/2007 01:00:00 17/11/2007 17:00:00
 18/11/2007 05:00:00 17/11/2007 17:00:00

Any help would be appreciated!

r2evans
  • 141,215
  • 6
  • 77
  • 149
Cam
  • 449
  • 2
  • 7

2 Answers2

3

We convert the 'Date' columns to Datetime class and filter

library(dplyr)
library(lubridate)
Data %>% 
    mutate(across(starts_with('Date'), dmy_hms)) %>% 
    filter(Date1 >= Date2)
#         Date1               Date2
#1 2007-11-17 17:00:00 2007-11-17 17:00:00
#2 2007-11-18 01:00:00 2007-11-17 17:00:00
#3 2007-11-18 05:00:00 2007-11-17 17:00:00

In base R, this can be done with as.POSIXct

subset(Data, as.POSIXct(Date1, format = "%d/%m/%Y %H:%M:%S") >= 
           as.POSIXct(Date2, format = "%d/%m/%Y %H:%M:%S"))
akrun
  • 874,273
  • 37
  • 540
  • 662
1

You can just use filter() to only "filter in" the data that meets your condition after converting the date columns into the date type. @Akrun's answer is the most concise, but if you want to keep the Date1 and Date2 columns as strings, you can instead create new columns to use for filtering and then remove them after.

Data %>%
    mutate(Date1_datetime_type = dmy_hms(Date1), #new columns to use for filtering
           Date2_datetime_type = dmy_hms(Date2)) %>%
    filter(Date1_datetime_type >= Date2_datetime_type) %>% #your condition
    select(Date1, Date2) #selecting only the original columns

And you'll get

#                Date1               Date2
#1 17/11/2007 17:00:00 17/11/2007 17:00:00
#2 18/11/2007 01:00:00 17/11/2007 17:00:00
#3 18/11/2007 05:00:00 17/11/2007 17:00:00

Edit: Wrote too fast and forgot to change the types of the date columns. Also added a use case (leaving the original data in the same data type).

spark
  • 33
  • 5