
I have a set of standard unix integer timestamps, all in UTC (GMT), that I'm feeding into R that I wish to plot. I've been using code of the form:

d$date_time <- as.POSIXct(d$date_time,origin="1970-01-01",tz="GMT")

to convert my column of standard Unix timestamps (UTC integers) into what I assume is some kind of plottable set of objects. I can plot this and the data looks approximately right, but I have no idea whether all of my data is being offset by the local timezone of my computer or by any other timezone adjustment. This is because I don't understand what adjustments (if any) get made to the data a) when I make the call to as.POSIXct() and b) when I plot the data. So these are my questions:

  1. When I specify tz="GMT" above, what exactly is this telling the computer to do? I see three possibilities: i) "your data is in GMT and you want it converted to your local time" ii) "your data is always assumed to be local time and you want it converted to GMT" iii) "your data is always assumed to be GMT and you want me to leave it in GMT, so don't make any adjustments".
  2. When I plot the data (with xyplot), does the plotting function make any visual adjustments to the time? If so, what adjustments?

I think if someone could explain how the internal data structures store timezone information as well as how those data structures are transformed by various commands it would help clear things up. Basically, I would like to work with UTC from the beginning right up to the point of display, where I might wish to make adjustments for timezones, though ideally explicitly rather than the computer silently deciding for me.

Bryce Thomas
  • AFAIK, the data is always stored in universal time, and simply displayed in local time when it gets to printing. I'll try to dig out a related question. – Andrie Dec 13 '12 at 12:52

1 Answer


R's Date and Time classes are very powerful, but they take some getting used to. TZ adjustments in particular are tricky, but they mostly come into play during construction and during conversion to/from character.

Consider the following example, which limits itself to numeric input. Here we have fine control:

R> tt <- as.POSIXct(0, origin="1970-01-01")
R> str(tt)
 POSIXct[1:1], format: "1970-01-01"
R> tt
[1] "1970-01-01 CST"
R> 
R> tt <- as.POSIXct(600, origin="1970-01-01")
R> tt
[1] "1970-01-01 00:10:00 CST"
R> 
R> tt <- as.POSIXct(600, origin="1970-01-01", tz="UTC")
R> tt
[1] "1970-01-01 00:10:00 UTC"
R> 
R> as.numeric(tt)
[1] 600
R> 

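You get the whole of date arithmetic, conversion, difftime() etc. and can still pass the pure numeric values on. Fractional seconds are kept as well (they are only printed if options(digits.secs=6) or similar has been set, as in the session below):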

R> tt <- as.POSIXct(600, origin="1970-01-01", tz="UTC")
R> tt
[1] "1970-01-01 00:10:00 UTC"
R> tt2 <- tt + 1.234567
R> tt2
[1] "1970-01-01 00:10:01.234566 UTC"
R> 

You can use attributes() to check if a TZ has been set:

R> attributes(tt)
$tzone
[1] "UTC"

$class
[1] "POSIXct" "POSIXt" 

R> 

So if you are careful about creation and conversion you can indeed have a full toolkit around UTC-based time data.
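To make the connection to the question's option iii) concrete, here is a minimal sketch, assuming a few made-up timestamps in `secs` and using "Australia/Brisbane" purely as an illustrative UTC+10 zone (neither appears in the original question). It suggests that with tz="UTC" the stored numbers are left alone, and that a time zone only matters when the values are turned into text:

```r
# Illustrative Unix timestamps (seconds since 1970-01-01 00:00:00 UTC)
secs <- c(0, 600, 86400)

# tz = "UTC" labels the object for display; the stored seconds are not shifted
tt <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
stopifnot(all(as.numeric(tt) == secs))

# Explicit, display-only conversion: ask format() for a specific time zone
fmt <- "%Y-%m-%d %H:%M:%S"
utc_txt   <- format(tt, fmt, tz = "UTC", usetz = TRUE)
local_txt <- format(tt, fmt, tz = "Australia/Brisbane")  # assumed example zone

# Re-labelling the time zone changes how the object prints, not what it stores
tt_local <- tt
attr(tt_local, "tzone") <- "Australia/Brisbane"
stopifnot(all(as.numeric(tt_local) == secs))

print(utc_txt)
print(local_txt)
```

Keeping the column in UTC and calling format() with an explicit tz only at the point of display is one way to avoid silent adjustments.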

Dirk Eddelbuettel
  • So regarding `2.` in the original question, assuming everything has been kept in UTC internally, do the plotting functions like `xyplot()` typically plot in UTC, or make adjustments for the local time zone (e.g. where I am, if it makes adjustments, all timestamps would be visually shifted +10 hours)? – Bryce Thomas Dec 14 '12 at 03:04
  • You will have to test that. Note that you can override your (default, and system) time zone values for the R session too via `Sys.setenv()`. That said, you could also force x-axis values by going to `as.numeric(timevariable)` and plotting against that. – Dirk Eddelbuettel Dec 14 '12 at 03:10
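
A rough sketch of the two suggestions in the last comment, assuming the session time zone is controlled through the `TZ` environment variable (the variable names `old_tz`, `x`, `shown` and `x_num` are made up for illustration). It does not say anything about what `xyplot()` itself does; that still needs to be checked by plotting:

```r
# Remember the current setting so it can be restored afterwards
old_tz <- Sys.getenv("TZ", unset = NA)

# Force the session time zone to UTC
Sys.setenv(TZ = "UTC")

# A POSIXct without a "tzone" attribute is displayed in the session time zone
x <- as.POSIXct(600, origin = "1970-01-01", tz = "UTC")
attr(x, "tzone") <- NULL
shown <- format(x, "%Y-%m-%d %H:%M:%S")
print(shown)

# Plotting against plain numbers sidesteps time zone handling on the axis
x_num <- as.numeric(x)
print(x_num)

# Restore the previous session time zone
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

With the numeric route the axis shows raw seconds since the epoch, so any labels have to be produced explicitly, for example with format() and a chosen tz.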