0

This is a subset of my data

    # | Event | Time
    1    A       22:00:00
    2    B       22:00:10
    3    B       22:00:20
    4    B       22:00:30
    5    C       22:00:40
    6    B       22:00:10
    7    B       22:00:20
    8    B       22:00:30
    9    A       22:00:40

I want to calculate the duration of a continuous occurrence of a same event – so, say I want the duration of event B, the output should be

    # | Event | Time      | Duration
    1    A       22:00:00   NA
    2    B       22:00:10   20 Secs
    3    B       22:00:20   NA
    4    B       22:00:30   NA
    5    C       22:00:40   NA
    6    B       22:00:10   20 Secs
    7    B       22:00:20   NA
    8    B       22:00:30   NA
    9    A       22:00:40   NA

I've tried (variations of) solutions from: Using conditional statements with difftime in R and Calulcate running difference of time using difftime on one column of timestamps

But I can't seem to get what I'm trying to achieve. Hope I can get some help! Thanks!

cookiedookie
  • 389
  • 1
  • 5
  • 17
  • Try `library(data.table); setDT(df1)[, grp := rleid(Event)][, Time := chron::times(Time)][Event == "B", Duration := c(as.numeric(max(Time)-min(Time))*24*60*60, rep(NA, .N-1)), grp]` – akrun Jul 07 '17 at 07:06

1 Answers1

1

We can use data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), create a grouping variable based on the run-length-id of 'Event' i.e. based on same adjacent elements ('grp'), convert the 'Time' to POSIXct , grouped by 'grp', specifying the 'i' with the logical condition (Event == "B"), find the difference of the last and first 'Time1' using difftime, then we assign the values of the 'Duration' that is not the first observation for each 'grp' to NA.

library(data.table)
setDT(df1)[, grp := rleid(Event)][, Time1 := as.POSIXct(Time, format = 
       "%H:%M:%S")][Event == "B", 
     Duration := as.numeric(difftime(Time1[.N], Time1[1], unit = "secs")), grp]
df1[df1[, .I[seq_len(.N) != 1], grp]$V1, Duration := NA][, c("Time1", "grp") := NULL][]
#   No Event     Time Duration
#1:  1     A 22:00:00       NA
#2:  2     B 22:00:10       20
#3:  3     B 22:00:20       NA
#4:  4     B 22:00:30       NA
#5:  5     C 22:00:40       NA
#6:  6     B 22:00:10       20
#7:  7     B 22:00:20       NA
#8:  8     B 22:00:30       NA
#9:  9     A 22:00:40       NA

data

df1 <- structure(list(No = 1:9, Event = c("A", "B", "B", "B", "C", "B", 
 "B", "B", "A"), Time = c("22:00:00", "22:00:10", "22:00:20", 
 "22:00:30", "22:00:40", "22:00:10", "22:00:20", "22:00:30", "22:00:40"
 )), .Names = c("No", "Event", "Time"), class = "data.frame", row.names = c(NA, 
-9L))
Community
  • 1
  • 1
akrun
  • 874,273
  • 37
  • 540
  • 662