I have a data.table named features with the columns nightNo, HR, motion and angle. For each nightNo, I'd like to compute the rolling variance of HR, motion and angle over the previous 600 points. I've come up with the following code to do this:
library(data.table)
library(zoo)  # provides rollapply
features <- data.table(nightNo=c(1,1,1,1,1,1,1,2,2,2,2,2,2,2),
HR=c(1,2,3,4,5,6,7,8,9,10,11,12,13,14),
motion=c(14,13,12,11,10,9,8,7,6,5,4,3,2,1),
angle=c(2,4,6,8,10,12,14,16,18,20,22,24,26,28))
# For the example I'll use a window of 6 instead of 600
window <- 6
features[, c("HR_Variance", "motion_Variance", "angle_Variance") :=
list(rollapply(HR, window, var, partial=TRUE, align = "right"),
rollapply(motion, window, var, partial=TRUE, align = "right"),
rollapply(angle, window, var, partial=TRUE, align = "right")), by=nightNo ]
# nightNo HR motion angle HR_Variance motion_Variance angle_Variance
# 1: 1 1 14 2 NA NA NA
# 2: 1 2 13 4 0.500000 0.500000 2.000000
# 3: 1 3 12 6 1.000000 1.000000 4.000000
# 4: 1 4 11 8 1.666667 1.666667 6.666667
# 5: 1 5 10 10 2.500000 2.500000 10.000000
# 6: 1 6 9 12 3.500000 3.500000 14.000000
# 7: 1 7 8 14 3.500000 3.500000 14.000000
# 8: 2 8 7 16 NA NA NA
# 9: 2 9 6 18 0.500000 0.500000 2.000000
# 10: 2 10 5 20 1.000000 1.000000 4.000000
# 11: 2 11 4 22 1.666667 1.666667 6.666667
# 12: 2 12 3 24 2.500000 2.500000 10.000000
# 13: 2 13 2 26 3.500000 3.500000 14.000000
# 14: 2 14 1 28 3.500000 3.500000 14.000000
The result is correct, but on my large dataset it takes far too long to run. I've also built other, similar features that use runmean and sapply calls over the same 600-point window per nightNo, and those run in a reasonable time, which makes me think that either rollapply or the var function is the slow part. Is there a way to make this code more efficient, possibly by replacing var or rollapply?
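One direction that might help, given here only as a rough sketch: the sample variance of a window can be written in terms of the window's sum and sum of squares, so cumulative sums would give every window in a single pass instead of calling var once per row. The helper name roll_var_cumsum is hypothetical (it is not part of zoo or data.table), and the sketch assumes there are no NA values in the input. It centers the data first because the sum-of-squares formula can lose precision when values are large relative to their spread.

```r
# Hypothetical helper (not from any package): right-aligned rolling sample
# variance with partial windows at the start, computed from cumulative sums.
# Assumes x is numeric and contains no NA values.
roll_var_cumsum <- function(x, window) {
  n <- length(x)
  if (n == 0L) return(numeric(0))
  # Centering leaves the variance unchanged but reduces cancellation error.
  x <- x - mean(x)
  s1 <- c(0, cumsum(x))     # s1[i + 1] is the sum of x[1..i]
  s2 <- c(0, cumsum(x^2))   # s2[i + 1] is the sum of x[1..i]^2
  end   <- seq_len(n)
  start <- pmax(end - window + 1, 1)  # first index of each window
  k     <- end - start + 1            # number of points in each window
  sx  <- s1[end + 1] - s1[start]      # window sum
  sxx <- s2[end + 1] - s2[start]      # window sum of squares
  out <- (sxx - sx^2 / k) / (k - 1)   # sample variance (denominator k - 1)
  out <- pmax(out, 0)                 # clamp tiny negative rounding errors
  out[k < 2] <- NA_real_              # var() of a single value is NA
  out
}

# Check against a direct var() call per window on the example HR values.
x <- c(1, 2, 3, 4, 5, 6, 7)
window <- 6
naive <- sapply(seq_along(x), function(i) var(x[max(1, i - window + 1):i]))
stopifnot(isTRUE(all.equal(roll_var_cumsum(x, window), naive)))
```

If this holds up, it could be applied per night in the same way as the rollapply version, for example features[, HR_Variance := roll_var_cumsum(HR, window), by = nightNo], but I have not benchmarked it against the full dataset.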