I record CO2 in df2 and have a list of experiment start and end times in d:
I have a data.frame df2 that contains continuous CO2 measurements over time.
df2 <- data.frame(
  CO2.ppm. = sample(300:500, 72, replace = TRUE),
  Dev.Date.Time = seq(
    from = as.POSIXct("2012-1-1 0:00", tz = "BST"),
    to = as.POSIXct("2012-1-3 23:00", tz = "BST"),
    by = "hour"
  )
)
I have a data.frame df1 with a continuous time variable called Dev.Date.Time, a column called ExperimentID, and a column called ExperimentType giving the type of experiment that was recorded. Note that there's a chunk of time where no experiment was taking place, but I don't need to remove it.
df1 <- data.frame(
  ExperimentID = rep(1:12, each = 6),
  ExperimentType = rep(c("IV", "NoExperiment", "Obs"), each = 24),
  Dev.Date.Time = seq(
    from = as.POSIXct("2012-1-1 0:00", tz = "BST"),
    to = as.POSIXct("2012-1-3 23:00", tz = "BST"),
    by = "hour"
  )
)
I then created another data.frame, d, with the start and end times of each experiment.
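library(dplyr)  # provides inner_join()
# First and last timestamp of each experiment
startTime <- aggregate(data = df1, Dev.Date.Time ~ ExperimentID + ExperimentType, head, 1)
endTime <- aggregate(data = df1, Dev.Date.Time ~ ExperimentID + ExperimentType, tail, 1)
# In d, Dev.Date.Time.x is the start time and Dev.Date.Time.y is the end time
d <- inner_join(startTime, endTime, by = c("ExperimentID", "ExperimentType"))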
I'd like to create two columns in df2, ExperimentID and ExperimentType, based on the start and end times stored in d.
The code below makes the breaks, but I can't work out how to make the labels match. Any thoughts are much appreciated.
Originally I thought about using cut. While it made the breaks I wanted, I wasn't any closer to labelling them by ExperimentID.
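# Pool all start and end times and sort them to use as breaks
breakz <- as_tibble(lubridate::ymd_hms(d$Dev.Date.Time.x, d$Dev.Date.Time.y))
breakz <- dplyr::arrange(breakz, value)
# This is the step where the labels fail to line up with the breaks
df1$ActivityID <- cut(df1$Dev.Date.Time, breaks = unique(breakz$value), labels = c(d$ExperimentID, d$ExperimentType))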
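For reference, here is a minimal base R sketch of the kind of interval labelling I'm after, using findInterval instead of cut. It is only an illustration under assumptions: d_demo and m_demo are hypothetical toy stand-ins for d and df2, the StartTime and EndTime column names are made up for the example, and the intervals are assumed to be sorted by start time and non-overlapping.

```r
# Hypothetical stand-in for d: one row per experiment, sorted by start time,
# with non-overlapping intervals (both are assumptions of this sketch)
d_demo <- data.frame(
  ExperimentID   = 1:3,
  ExperimentType = c("IV", "NoExperiment", "Obs"),
  StartTime = as.POSIXct(c("2012-01-01 00:00", "2012-01-01 06:00",
                           "2012-01-01 12:00"), tz = "UTC"),
  EndTime   = as.POSIXct(c("2012-01-01 05:00", "2012-01-01 11:00",
                           "2012-01-01 17:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for df2: hourly readings offset by 30 minutes,
# so no timestamp matches a start or end time exactly
m_demo <- data.frame(
  Dev.Date.Time = seq(as.POSIXct("2012-01-01 00:30", tz = "UTC"),
                      by = "hour", length.out = 18)
)

# For each reading, find the last interval whose start is <= the timestamp
t_num <- as.numeric(m_demo$Dev.Date.Time)
idx <- findInterval(t_num, as.numeric(d_demo$StartTime))
idx[idx == 0] <- NA  # reading is before the first start time

# Drop readings that fall after the end of that interval (i.e. in a gap)
after_end <- !is.na(idx) & t_num > as.numeric(d_demo$EndTime[idx])
idx[after_end] <- NA

# Look up the labels by row index; readings outside every interval stay NA
m_demo$ExperimentID   <- d_demo$ExperimentID[idx]
m_demo$ExperimentType <- d_demo$ExperimentType[idx]
```

Readings that fall between the end of one experiment and the start of the next come out as NA, which would need handling separately if gaps should inherit a label.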
EDIT:
Based on suggestions in the comments, I'm trying fuzzyjoin because in reality the timestamps don't match exactly, so I need to merge by an interval.
require(fuzzyjoin)
df3 <- fuzzy_right_join(
  d, df2,
  by = c(
    "StartTime" = "Dev.Date.Time",
    "EndTime" = "Dev.Date.Time"
  ),
  match_fun = list(`>=`, `<=`)
)
This produces NA for every value of df3$ExperimentID. Any thoughts?
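One thing that may be worth checking is the direction of the comparisons. As far as I understand fuzzyjoin, each function in match_fun is called with the column from the first table on the left and the column from the second table on the right. With d first, the call above would then test StartTime >= Dev.Date.Time and EndTime <= Dev.Date.Time, which can only both hold when all three times are equal. Below is a small sketch with the measurements as the first table and the comparisons written from their point of view. The tables iv and readings and their values are hypothetical toy data, and the StartTime and EndTime names assume the columns of d have been renamed.

```r
library(fuzzyjoin)

# Hypothetical experiment intervals (stand-in for d with renamed columns)
iv <- data.frame(
  ExperimentID   = 1:2,
  ExperimentType = c("IV", "Obs"),
  StartTime = as.POSIXct(c("2012-01-01 00:00", "2012-01-01 06:00"), tz = "UTC"),
  EndTime   = as.POSIXct(c("2012-01-01 05:00", "2012-01-01 11:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Hypothetical readings (stand-in for df2); the second one falls in a gap
readings <- data.frame(
  CO2.ppm. = c(410, 420, 430),
  Dev.Date.Time = as.POSIXct(c("2012-01-01 00:30", "2012-01-01 05:30",
                               "2012-01-01 06:30"), tz = "UTC")
)

# Keep every reading; attach the interval where
# Dev.Date.Time >= StartTime and Dev.Date.Time <= EndTime
res <- fuzzy_left_join(
  readings, iv,
  by = c(
    "Dev.Date.Time" = "StartTime",
    "Dev.Date.Time" = "EndTime"
  ),
  match_fun = list(`>=`, `<=`)
)
```

In this toy case the reading inside the gap keeps NA for ExperimentID and ExperimentType, while the other two pick up the experiment whose interval contains them.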