I have a data.table
with UTC date-time-stamp records spanning multiple time zones, and I want to create a new column that will show the date-time-stamp but in the specific time zone of each observation, which is specified by a variable in the same table:
require("lubridate")
require("data.table")
dt <- data.table(A = 1:5, B = rep(ymd_hms("2016-03-24 17:15:12", tz = "UTC"), 5), timezone = c("America/Indiana/Vincennes", "Australia/North", "Pacific/Palau", "Antarctica/Macquarie", "Asia/Nicosia"))
I was trying to accomplish this with the following, but it does not seem to function:
dt[, B_local := with_tz(B, tz = timezone)]
dt
Error in as.POSIXlt.POSIXct(x, tz) : invalid 'tz' value
When I try to add a by
specification in the command, it gets closer to the desired output, but is incorrect and I think somehow is due to non-unique pairs of date-time and timezone like this sample table, ie:
dt[, B_local := with_tz(B, tz = timezone), by = .(B, timezone)]
dt
A B timezone B_local
1: 1 2016-03-24 17:15:12 America/Indiana/Vincennes 2016-03-24 19:15:12
2: 2 2016-03-24 17:15:12 Australia/North 2016-03-24 19:15:12
3: 3 2016-03-24 17:15:12 Pacific/Palau 2016-03-24 19:15:12
4: 4 2016-03-24 17:15:12 Antarctica/Macquarie 2016-03-24 19:15:12
5: 5 2016-03-24 17:15:12 Asia/Nicosia 2016-03-24 19:15:12
Even if I change by = .(A)
in dt[, B_local := with_tz(B, tz = timezone), by = .(A)]
which subsets the table into each row, the output is identical to the above.
NB: I'm more than happy to use something other than lubridate
but I'd prefer to work within data.table
for efficiencies as I have a large dataset.