10

I have a dataframe like so:

library(dplyr)
data <- data_frame(
  timestamp_utc = c('2015-11-18 03:55:04', '2015-11-18 03:55:08', 
                    '2015-11-18 03:55:10'),
  local_tz = c('America/New_York', 'America/Los_Angeles', 
               'America/Indiana/Indianapolis')
  )

I need to create a new variable that converts the UTC timestamp to the local time as defined in the local_tz column. However, both format and with_tz (from lubridate) expect only one timezone, not a vector of timezones. I'm looking for something like this:

mutate(data, timestamp_local = with_tz(timestamp_utc, tzone = local_tz))

Any ideas?

josiekre
  • 795
  • 1
  • 7
  • 19

5 Answers5

7

First make sure your data is loaded as a date - I had to convert to date first:

data$timestamp_utc <- as.POSIXct(data$timestamp_utc, tz = "UTC")

Then you can use the function rowwise from dplyr, in combination with do:

library(lubridate)
library(dplyr)
z <- data %>% rowwise() %>%
              do(timestamp_local = with_tz(.$timestamp_utc, tzone = .$local_tz))
data$timestamp_local <- z$timestamp_local

data$timestamp_local
[[1]]
[1] "2015-11-17 22:55:04 EST"

[[2]]
[1] "2015-11-17 19:55:08 PST"

[[3]]
[1] "2015-11-17 22:55:10 EST"

We need to make the timestamp_local column a list, as otherwise all the timezones are converted back to one you can only have one timezone in a vector).

jeremycg
  • 24,657
  • 5
  • 63
  • 74
  • 1
    This works! It is slow though. I wonder if `with_tz` could be vectorized. I'll hop over to github and submit it as a feature request. – josiekre Nov 21 '15 at 22:17
  • 1
    Actually this didn't work. It made the offset the same for all three rows. The answer should have a -0500 for NY and Indianapolis and -0800 for LA. – josiekre Nov 22 '15 at 22:07
  • Ugh, it should be fixed now – jeremycg Nov 23 '15 at 15:23
3

Here is one method. With this, the result has to be a string, otherwise unlist() or c() will turn the result back to the system timezone for every element in the list.

It's still slow though because it is not vectorized.

> get_local_time <- function(timestamp_utc, local_tz) {
    l <- lapply(seq(length(timestamp_utc)), 
                function(x) {format(with_tz(timestamp_utc[x], local_tz[x]), "%FT%T%z")})
    unlist(l)
    }

> mutate(data, timestamp_local = get_local_time(timestamp_utc, tzone = local_tz))

Source: local data frame [3 x 3]

        timestamp_utc                     local_tz          timestamp_local
               (time)                        (chr)                    (chr)
1 2015-11-18 03:55:04             America/New_York 2015-11-17T22:55:04-0500
2 2015-11-18 03:55:08          America/Los_Angeles 2015-11-17T19:55:08-0800
3 2015-11-18 03:55:10 America/Indiana/Indianapolis 2015-11-17T22:55:10-0500

Update 2015-11-24

Using dplyr::combine() rather than unlist() allows the variable to remain datetimes with the right timezone attributes rather than converting to strings.

> get_local_time <- function(timestamp_utc, local_tz) {
    l <- lapply(seq(length(timestamp_utc)), 
                function(x) {with_tz(timestamp_utc[x], local_tz[x])})
    combine(l)
    }

> mutate(data, timestamp_local = get_local_time(timestamp_utc, tzone = local_tz))

Source: local data frame [3 x 3]

        timestamp_utc                     local_tz     timestamp_local
               (time)                        (chr)              (time)
1 2015-11-18 03:55:04             America/New_York 2015-11-17T22:55:04
2 2015-11-18 03:55:08          America/Los_Angeles 2015-11-17T19:55:08
3 2015-11-18 03:55:10 America/Indiana/Indianapolis 2015-11-17T22:55:10
josiekre
  • 795
  • 1
  • 7
  • 19
  • I was not able to run the example successfullyt with the data specified above: `Error: no applicable method for 'reclass_date' applied to an object of class "character"` – MartinT Apr 14 '16 at 00:54
3

The trick is to use group_by() with local_tz before mutate():

data$timestamp_utc <- as.POSIXct(data$timestamp_utc, tz = "UTC")

data %>% 
  group_by(local_tz) %>%
  mutate(timestamp_local = with_tz(timestamp_utc, local_tz))
Claudia
  • 996
  • 1
  • 10
  • 27
0

A data.table option worked well for me:

data[, timestamp_local := with_tz(timestamp_utc, local_tz), by=local_tz]
Tiffany
  • 301
  • 1
  • 2
  • 12
-1

One can vectorize time zone conversion as follows

library(dplyr)
library(lubridate)

with_tz_utc <- function(ts, tz) force_tz(with_tz(ts, tz), 'UTC')
as_datetime_with_tz_utc <- compose(as_datetime, Vectorize(with_tz_utc))

Now use mutate as usual

data %>%
  mutate(
    timestamp_utc = as_datetime(timestamp_utc),
    timestamp_local = as_datetime_with_tz_utc(timestamp_utc, local_tz)
  )

As an alternative - which is much slower - one can use function rowwise with mutate and ungroup (reverting rowwise) like this

data %>%
  rowwise() %>%
  mutate(
    timestamp_utc = as_datetime(timestamp_utc),
    timestamp_local = with_tz_utc(timestamp_utc, local_tz)
  ) %>%
  ungroup()
spren9er
  • 744
  • 8
  • 11