I have a data.table representing three variables:
- Start_time (commencement of a certain procedure)
- Time (the actual time)
- Value (some value ascribed to each procedure at a given time)
I've formed a dummy set of data to represent this below:
libarary(data.table)
library(dplyr)
Start_number <- 3
Duration <- 4
Table <- rbindlist(lapply(seq_len(Start_number),function(x) data.table(Start_time = x,
Time = as.integer(x + seq_len(Duration) - 1)) %>% mutate(Value = 2)))
(I've just put the "Value" column to be '2', however this will vary in my actual data set)
I need to create a table which has the following two columns of output:
- Time - each value from smallest to largest
- Total_FV - the sum of the 'Value' column subject to the following conditions:
Condition 1: We sum up the 'Value' column for rows where 'Time' > current row's value of 'Time' (e.g. if 'Time' = 1, we'd only sum up rows where 'Time' = 2, 3, 4, ...)
Condition 2: Where Condition 1 is met, we exclude summing up rows where 'Start_time' > 'Time'.
I'll explain what the intended output is to hopefully make more sense of the above.
Target_output <- data.table(Time = seq_len(Duration),Total_FV = c(6,10,12,6,2,0))
Row 1 of output: Need to sum up for all values of Time > 1, but only where Start_time <= 1. From the "Table" data.table, this gives us rows 2, 3 and 4 which sum up to 6.
Row 2 of output: Need to sum up for all values of Time > 2, but only where Start_time <= 2. This gives us rows 3 and 4 where Start_time = 1, and then rows 6 - 8 where Start_time = 2. In total, this gives us 10.
Row 3 of output: Need to sum up for all values of Time > 3, but only where Start_time <= 3. This gives us row 4 where Start_time = 1, rows 7 - 8 where Start_time = 2, and then rows 10 - 12 for Start_time = 3. In total, this gives us 12.
...and so on
I tried to achieve this by conducting using non-equi joins with 'Table' on itself as follows:
Result <- Table[Table,on = .(Start_time >= Time,Time > Time),.(Total_FV = sum(Value)),by = Time]
Unfortunately this looks to be a long way off.
Can someone please see if they have a data.table-like approach to this, presumably using joins?
Thanks!