I have two large datasets whose only shared feature is a numerical timestamp. I'd like to merge the data frames by this timestamp, but the two were collected at different frequencies, so the timestamps don't match exactly and each row needs to be merged with its nearest match.
As a simplified example, here's a small data set with a value column, some event, and an ID:
library(data.table)
# value is numeric so that "nearest" is well defined
a <- c(150, 164, 175, 183, 195, 200, 205, 213)
b <- c("start1", "end1", "start2", "end2", "start1", "end1", "start2", "end2")
c <- c("A", "A", "A", "A", "B", "B", "B", "B")
(data <- data.table(value = a, event = b, ID = c))
And I'd like to be able to merge this "data" with this numerical series ("times") by the value column:
(times <- data.frame(value = seq(from = 150, to = 213, by = 3)))
Each row of "data" should be matched to the nearest value in "times", and any "times" value without a match should be left blank, to produce this final data frame:
agoal <- seq(from = 150, to = 213, by = 3)
bgoal <- c("start1", "", "", "", "", "end1", "", "",
           "start2", "", "", "end2", "", "", "",
           "start1", "", "end1", "start2", "", "", "end2")
cgoal <- c("A", "", "", "", "", "A", "", "",
           "A", "", "", "A", "", "", "",
           "B", "", "B", "B", "", "", "B")
(goal <- data.frame(value = agoal, event = bgoal, ID = cgoal))
Is there a way to do this that is efficient enough to work on a very large dataset without crashing R?
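For illustration, here is a minimal sketch of one possible direction, assuming data.table's rolling join (`roll = "nearest"`) is acceptable and that the value column is numeric in both tables. It first snaps each event to its nearest time value, then left-joins the snapped events onto the full time series. The names `snapped` and `result` are hypothetical, and unmatched rows come back as `NA` rather than `""`. It has not been checked on a large dataset.

```r
library(data.table)

# Example data from above, with value stored as numeric (an assumption)
data <- data.table(
  value = c(150, 164, 175, 183, 195, 200, 205, 213),
  event = c("start1", "end1", "start2", "end2", "start1", "end1", "start2", "end2"),
  ID    = c("A", "A", "A", "A", "B", "B", "B", "B")
)
times <- data.table(value = seq(from = 150, to = 213, by = 3))

# Step 1: for each row of data, look up the nearest value in times.
# x.value is the matched value from times; i.* columns come from data.
snapped <- times[data, on = "value", roll = "nearest",
                 .(value = x.value, event = i.event, ID = i.ID)]

# Step 2: left-join the snapped events onto the full series,
# keeping every time value; rows with no event get NA.
result <- merge(times, snapped, by = "value", all.x = TRUE)
print(result)
```

If two events snap to the same time value, that value will appear twice in `result`, so whether that is acceptable depends on the real data.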