I'm new to the data.table "scene", so I apologize if my question is simplistic. I am constantly in the position where I have to apply some analysis to, or subset, data grouped by a unique ID. Typically I have about 1,000 rows per unique ID, with about 30 unique IDs. So I've been advised to switch to data.table instead of trying to figure out lapply, sapply, or the plyr package.
Here's a sample of my type of data:
test <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), dt = structure(c(1138366975,
1138370472, 1138374064, 1138377669, 1138381264, 1138384873, 1138388503,
1138399312, 1138402842, 1138406507, 1138413700, 1138417261, 1138420848,
1138424444, 1138428071, 1138431695, 1138435287, 1138438938, 1138442428,
1138446098), class = c("POSIXct", "POSIXt"), tzone = "GMT")), .Names = c("ID",
"dt"), row.names = c(NA, -20L), class = "data.frame")
I convert this into a data.table:
library(data.table)
X = data.table(test)
and set my key to be the individual (ID):
setkey(X,ID)
The goal is then to calculate the time difference between each successive location by individual (in this case ID), i.e. Time2 - Time1. Ideally this would be in hours, although at the moment either hours or minutes would do.
X[, diff:=c(NA,diff(dt)),by = ID]
The diff command here calculates it in minutes, but I would like to convert/round this to hours in the most efficient way while still keeping the value as a POSIX or time object. I know I could likely create yet another column and divide diff by 60, but I was hoping there was some way to just type "hours" or "minutes" somewhere, as I don't understand how data.table handles time.
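For illustration, here is a minimal sketch of one way the units might be pinned to hours. It relies on base R's `as.numeric()` method for difftime objects, which accepts a `units` argument. Note that `diff()` on a POSIXct vector chooses its units automatically on each call, so when it is run once per group the units could differ between IDs. The column name `diff_hours` is made up for this example, and the data are just the first few rows of the sample above.

```r
library(data.table)

# First few rows of the sample data (IDs 1 and 2 only)
X <- data.table(
  ID = c(1L, 1L, 1L, 2L, 2L),
  dt = as.POSIXct(c(1138366975, 1138370472, 1138374064,
                    1138381264, 1138384873),
                  origin = "1970-01-01", tz = "GMT")
)
setkey(X, ID)

# Gap from the previous row within each ID, forced to hours.
# as.numeric(<difftime>, units = "hours") converts explicitly instead of
# relying on the units diff() happens to pick for that group.
# diff_hours is a hypothetical column name; the result is plain numeric.
X[, diff_hours := c(NA_real_, as.numeric(diff(dt), units = "hours")), by = ID]

print(X)
```

The result here is an ordinary numeric column rather than a time object. If a difftime object is preferred, base R's `difftime()` also takes a `units` argument.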
I've tried doing this in a data.frame using a for loop with the difftime command, but it's so cumbersome, and linking the data back to the original data frame is confusing to me as I am not proficient with for loops.
Once I have the data in hours, I want to select only the data that are 0.5 hours apart, then 4 hours apart, then 12 hours apart, which I haven't figured out how to do yet in data.table.
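To make the selection step concrete, here is a rough sketch under one possible reading of "apart": keep the rows whose gap from the previous row (within the same ID) is close to the target interval. The helper `pick_gap`, the tolerance `tol`, the column name `diff_hours`, and the toy timestamps are all assumptions made up for this example, not part of the real data.

```r
library(data.table)

# Toy data (hypothetical): gaps of 0.5 h and 4 h for ID 1, 4 h and 12 h for ID 2
X <- data.table(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  dt = as.POSIXct(c(0, 1800, 16200, 0, 14400, 57600),
                  origin = "2006-01-27", tz = "GMT")
)
setkey(X, ID)

# Gap in hours from the previous row within each ID
X[, diff_hours := c(NA_real_, as.numeric(diff(dt), units = "hours")), by = ID]

# Hypothetical helper: rows whose gap is within `tol` hours of `target`
pick_gap <- function(DT, target, tol = 0.1) {
  DT[!is.na(diff_hours) & abs(diff_hours - target) <= tol]
}

half_hour   <- pick_gap(X, 0.5)
four_hour   <- pick_gap(X, 4)
twelve_hour <- pick_gap(X, 12)
```

A tolerance is used because real timestamps are rarely exactly 0.5, 4, or 12 hours apart. A different reading of the goal (for example, thinning each track so kept rows are at least N hours apart) would need a different approach.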