Here's my interpretation of this. You're starting with a data.frame that looks like this. (I've added an extra out-of-order value to the data.frame so my answer will be different from yours.)
mydf <- data.frame(ID = c("ID1", "ID1", "ID2", "ID2", "ID3", "ID3"),
                   Date = c("Mar 01", "Mar 02", "Mar 03", "Mar 04", "Mar 05", "Mar 04"))
mydf
#    ID   Date
# 1 ID1 Mar 01
# 2 ID1 Mar 02
# 3 ID2 Mar 03
# 4 ID2 Mar 04
# 5 ID3 Mar 05
# 6 ID3 Mar 04
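First, create actual date objects out of your "Date" column. I've assumed your date format is "mon day", so I've used "%b %d" in strptime. Since there is no year, the current year is assumed. Note that "%b" matches abbreviated month names in the current locale, so this assumes an English locale.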
Date2 <- strptime(mydf$Date, format="%b %d") ## ASSUMES THE CURRENT YEAR
Date2
# [1] "2013-03-01" "2013-03-02" "2013-03-03" "2013-03-04" "2013-03-05" "2013-03-04"
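If relying on the current year feels fragile, one option is to paste an explicit year onto each string before parsing. This is only a sketch: the year 2013 is an assumption chosen to match the output above, and Date3 is just an illustrative name (the rest of this answer keeps using Date2).

```r
# Rebuild the example data so this snippet runs on its own
mydf <- data.frame(ID = c("ID1", "ID1", "ID2", "ID2", "ID3", "ID3"),
                   Date = c("Mar 01", "Mar 02", "Mar 03", "Mar 04", "Mar 05", "Mar 04"))

# Append an assumed year (2013) and parse with a matching "%Y" in the format.
# "%b" is locale-dependent; this assumes English month abbreviations.
Date3 <- as.Date(paste(mydf$Date, "2013"), format = "%b %d %Y")
Date3
```

The result is a Date vector rather than POSIXlt, which is usually enough when there is no time-of-day component.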
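Next, rank these dates within each "ID" group. In base R, ave does that pretty conveniently. Note that rank, not order, is the function to apply here: order returns the sorting permutation, which is guaranteed to match the ranks only for groups of one or two rows.
ave(as.numeric(Date2), mydf$ID, FUN = function(x) rank(x, ties.method = "first"))
# [1] 1 2 1 2 2 1
Use those values to subset the rows with the first (earliest) date for each ID (that is, where the result is equal to 1).
mydf[ave(as.numeric(Date2), mydf$ID, FUN = function(x) rank(x, ties.method = "first")) == 1, ]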
#    ID   Date
# 1 ID1 Mar 01
# 3 ID2 Mar 03
# 6 ID3 Mar 04
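One thing to watch with larger groups: order and rank agree for a two-row group, but not in general. Here is a small sketch, using a made-up three-date group (not from your data), of which function marks the earliest row when compared against 1.

```r
# Hypothetical single group with three dates, deliberately out of order
x <- as.numeric(as.Date(c("2013-03-05", "2013-03-03", "2013-03-04")))

ord <- order(x)                       # indices that would sort x
rnk <- rank(x, ties.method = "first") # rank of each element, ties broken by position

earliest_by_rank  <- which(rnk == 1)  # position whose value has rank 1
earliest_by_order <- which(ord == 1)  # position where the sort index equals 1
true_earliest     <- which.min(x)     # reference answer

earliest_by_rank == true_earliest     # TRUE
earliest_by_order == true_earliest    # FALSE
```

With ties.method = "first", tied dates within a group still yield exactly one row with rank 1, so each ID contributes a single row.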