3

I am struggling to understand what exactly happens behind the scenes when passing POSIXct objects between R and Postgres using RPostgreSQL. In the following example, I define two timestamp fields: one with a timezone the other one without. However, it appears that they are treated exactly the same when passing POSIXct objects via dbWriteTable and dbReadTable.

library(RPostgreSQL)

drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host = "127.0.0.1", port = "5432", user= "postgres",
                 dbname = "test_db")

q <- "
CREATE TABLE test_table
(
  dttm timestamp without time zone,
  dttmtz timestamp with time zone
)"
dbSendQuery(con, q)

# using timezone CET
dttm <- as.POSIXct("2016-01-01 10:20:10", tz="CET")
df <- data.frame(dttm = dttm, dttmtz = dttm)
dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)

# using timezone UTC    
dttm <- as.POSIXct("2016-01-01 14:20:10", tz="UTC")
df <- data.frame(dttm = dttm, dttmtz = dttm)
dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)

df2 <- dbReadTable(con, "test_table")

Both fields come out exactly the same. It appears as if the timezones are completely discarded.

df2$dttm
[1] "2016-01-01 10:20:10 CET" "2016-01-01 14:20:10 CET"

df2$dttmtz
"2016-01-01 10:20:10 CET" "2016-01-01 14:20:10 CET"

QUESTIONS:

  1. What exactly goes on behind the scenes?
  2. How can I properly pass the POSIXct's timezone back and forth?
Jaap
  • 81,064
  • 34
  • 182
  • 193
Mark Heckmann
  • 10,943
  • 4
  • 56
  • 88

2 Answers2

2

I think you've pointed out a bug in RPostgreSQL: it does not seem to be getting time zone from R for POSIXct objects. Timezone information can be passed correctly to PostgreSQL by formatting timestamps as character with offset from UTC (see example at bottom of this answer; added 2018-09-21). But first, here's an illustration of the apparent bug:

Modifying your code:

library(RPostgreSQL)

drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv,  port = "5432", user= "postgres",
                 dbname = "test")

# timestamps in three different time zones
dt1 <- as.POSIXct("2016-01-01 10:20:10", tz="US/Eastern")
dt2 <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dt3 <- as.POSIXct("2016-01-01 10:20:10", tz="Asia/Tokyo")
df <- data.frame(dt1=dt1, dt2=dt2, dt3=dt3)

q <- "
CREATE TABLE test_table
(
  dt1 timestamp with time zone,
  dt2 timestamp with time zone,
  dt3 timestamp with time zone,
  PRIMARY KEY (dt1)
)"
dbSendQuery(con, q)

dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)

df2 <- dbReadTable(con, "test_table")

note that all three timestamps are equal timezones not handled correctly

df2$dt1

"2016-01-01 10:20:10 EST"

df2$dt2

"2016-01-01 10:20:10 EST"

df2$dt3

"2016-01-01 10:20:10 EST"

And same is true in postgres - as seen in pgadmin here enter image description here

This suggests postgres is not getting timezone from R

Note that if we manually change one time zone in test_table (e.g., first record in pgadmin)

eg, enter image description here

and fetch

df2 <- dbReadTable(con, "test_table")

then the timezone is correctly handled

df2$dt1

"2016-01-01 05:20:10 EST"

df2$dt2

"2016-01-01 10:20:10 EST"

df2$dt3

"2016-01-01 10:20:10 EST"

So this suggets that RPostgreSQL is not correctly passing time zone information to postgres but that RPostgreSQL is correctly getting time zone information from postgres.


answer to the original question

To pass a timestamp with timezone from R to Postgres using RPostgreSQL, just format it as a character string with the offset from UTC (e.g., "2016-01-01 10:20:10-0500"; e.g., use format and then pass it to Postgres, same as above.

E.g.:

#convert POSIXct to character with offset from UTC
df$dt1 <- format(df$dt1, format = "%Y-%m-%d %H:%M:%OS%z")
df$dt2 <- format(df$dt2, format = "%Y-%m-%d %H:%M:%OS%z")
df$dt3 <- format(df$dt3, format = "%Y-%m-%d %H:%M:%OS%z")

##> df
##                       dt1                      dt2                      dt3
##1 2016-01-01 10:20:10-0500 2016-01-01 10:20:10+0000 2016-01-01 10:20:10+0900

q <- "
CREATE TABLE test_table2
(
  dt1 timestamp with time zone,
  dt2 timestamp with time zone,
  dt3 timestamp with time zone,
  PRIMARY KEY (dt1)
)"
dbSendQuery(con, q)

dbWriteTable(con, "test_table2", df, overwrite=FALSE, append=T, row.names=0)

df3 <- dbReadTable(con, "test_table2")

#Note that times are now correct (in local time zone)
##> df3$dt1
##[1] "2016-01-01 10:20:10 EST"
##> df3$dt2
##[1] "2016-01-01 05:20:10 EST"
##> df3$dt3
##[1] "2015-12-31 20:20:10 EST"
Chris Holbrook
  • 2,531
  • 1
  • 17
  • 30
  • revisiting this today, I noticed that the question (how to correctly pass timestamps with timezone to postgres) was not answered, so I added it (e.g., pass as character string with offset from UTC). – Chris Holbrook Sep 22 '18 at 00:49
0

First, let me say that I am not a fan of the way R handles this. Let's take a value of time in UTC:

dttm.utc <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dttm.utc
[1] "2016-01-01 10:20:10 UTC"

Now we can convert it to CET timezone relatively easily:

dttm.cet <- format( dttm.utc, tz = "CET", usetz = T )
dttm.cet
[1] "2016-01-01 11:20:10 CET"

But check this out, each of those values is of a different class. The first is in a POSIX format, but the second has been converted to character class by the format function.

class( dttm.utc )
[1] "POSIXct" "POSIXt" 

class( dttm.cet )
[1] "character"

That's no good, because it means we can't just do the same conversion again in the other direction, we need to first convert the latter value to POSIX class first, being very careful not to let R muck around with the timezone:

dttm.cet <- as.POSIXct( dttm.cet, tz = "CET" )
dttm.cet
[1] "2016-01-01 11:20:10 CET"

class( dttm.cet )
[1] "POSIXct" "POSIXt" 

Now we can convert it:

format( dttm.cet, tz = "UTC", usetz = TRUE )
[1] "2016-01-01 10:20:10 UTC"

But that puts us back to character class. Very annoying. Here's a workaround. Build the two-step conversion into a function, and use that from now on.

convert.tz <- function( x, tz ) {
    new <- format( x, tz = tz, usetz = T )
    return( as.POSIXct( new, tz = tz ) )
}

Give that a try:

dttm.utc <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dttm.utc
[1] "2016-01-01 10:20:10 UTC"

dttm.cet <- convert.tz( dttm.utc, "CET" )
dttm.cet
[1] "2016-01-01 11:20:10 CET"

class( dttm.utc )
[1] "POSIXct" "POSIXt" 

class( dttm.cet )
[1] "POSIXct" "POSIXt" 

So now the conversion doesn't change the format, which means we can go either way in the conversion, without changing the method:

convert.tz( dttm.cet, "UTC" )
[1] "2016-01-01 10:20:10 UTC"

Ahhh. Much better.

Of course, you could stick with base R, and do this every time you convert.

dttm.cet <- as.POSIXct( format( dttm.utc, tz = "CET", usetz = T ), tz = "CET" )

But personally, I like the function a lot better.

rosscova
  • 5,430
  • 1
  • 22
  • 35
  • Thank rosscova. However, I am not sure if you address my question which concerns passing datetimes between R and Postgres, not just changing the timezone of a POSIXct object. For the latter there are functions available, e.g. `lubridate::with_tz(dttm.utc, "CET")`. So no need to write another wrapper. – Mark Heckmann Nov 10 '16 at 10:33
  • Right, sorry. I figured the issue you're having was with the R conversion itself. Doesn't solving the R tz conversion problem solve the rest of the issue? And thanks for the `lubridate` solution, I'll need to get more familiar with it methinks :) – rosscova Nov 10 '16 at 10:38