For some reason when I convert my time series data from a data frame to an xts object, the timezone is included in the index. I suspect this is what the issue is when I try to run time series modelling on the object because I keep getting errors. When I go to check the structure of the xts object, the data inside the xts object are somehow converted to chr. They should be num which is what they originally were before the conversion. Here are some data:

full_timestamp             PRICE
2015-01-02 10:02:27.389055  85.4
2015-01-02 10:03:30.926059  85.3
2015-01-02 10:04:52.231750  85.4
2015-01-02 10:05:37.139763  85.5
2015-01-02 10:06:54.926069  85.5
2015-01-02 10:07:57.253187  85.3

Here is the structure of the dataframe:

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   89026 obs. of  2 variables:
 $ full_timestamp: POSIXct, format: "2015-01-02 10:02:27.389055" "2015-01-02 10:03:30.926059" "2015-01-02 10:04:52.231750" ...
 $ PRICE         : num  85.4 85.3 85.4 85.5 85.5 ...

The code that I used to convert the timestamp from a character vector to a POSIXct timestamp:

testing_eq_4xts$full_timestamp <- as.POSIXct(strptime(testing_eq_4xts$full_timestamp, 
                                                      format = "%Y-%m-%d:%H:%M:%OS",
                                                      tz = ""))

I have tried to include tz = "", not include the tz part at all, and even Sys.unsetenv("TZ") to stop the conversion picking up the timezone. I should also stress that I need the granularity in the timestamp for what I am modelling. Here is the code I use to convert to xts:

testing_eq_xts <- as.xts(testing_eq_4xts[, names(testing_eq_4xts) != "full_timestamp"],
                  order.by = testing_eq_4xts$full_timestamp, unique = FALSE)

and this is what the structure looks like:

An ‘xts’ object on 2015-01-02 10:02:27.389055/2015-12-31 14:37:07.969814 containing:
  Data: num [1:89026, 1] 85.4 85.3 85.4 85.5 85.5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr "PRICE"
  Indexed by objects of class: [POSIXct,POSIXt] TZ: 
  xts Attributes:  
 NULL

also the index of the xts:

[1] "2015-01-02 10:02:27.389055 AEDT" "2015-01-02 10:03:30.926059 AEDT"
[3] "2015-01-02 10:04:52.231750 AEDT" "2015-01-02 10:05:37.139763 AEDT"

Leaving the timestamp as a character vector and trying to convert it to POSIXct when converting to xts also does not work, using:

testing_eq_xts2 = xts(testing_eq_4xts[, 2], as.POSIXct(testing_eq_4xts[, 1]))

gives this error:

Error in as.POSIXct.default(testing_eq_4xts[, 1]) : 
  do not know how to convert 'testing_eq_4xts[, 1]' to class “POSIXct”

Am I on the right track as to why PRICE has been converted to chr in the xts object, and if so, how do I fix it? If I am incorrect in my assumptions, then what is it that I need to do to fix this? Thanks.

jem888

1 Answer

You have already tried setting tz = "", but if you look at ?strptime you will see that tz = "" really means "use the current time zone", which is pulled from your system settings. You should instead use tz = "GMT" (assuming that UTC is what you mean when you say "no time zone"). There is really no way to have "no time zone", since a time is always measured in some time zone or another.
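To see what the `tz` argument changes, here is a small base-R sketch. It assumes the timestamps are meant as UTC and that the system's time zone database includes "Australia/Sydney" (the zone behind the AEDT label in the question); the variable names are only illustrative.

```r
# The same wall-clock string parsed under two time zones gives two different instants.
stamp <- "2015-01-02 10:02:27.389055"

utc_time    <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
sydney_time <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%OS", tz = "Australia/Sydney")

# The time zone is stored as an attribute; the value itself is seconds since 1970-01-01 UTC.
attr(utc_time, "tzone")

# Sydney was on daylight time (UTC+11) on this date, so the two instants are 11 hours apart.
offset_hours <- as.numeric(difftime(utc_time, sydney_time, units = "hours"))
offset_hours
```

The point is that a time zone label is always attached to a POSIXct value; choosing one explicitly just makes the result independent of the machine's settings.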

Also, you cannot have a POSIXct column in an xts (or zoo) object, and probably should not have a character column either, since you would generally expect an xts object to hold numeric values. That is because xts and zoo objects are actually R matrices, and matrix columns cannot carry their own attributes. Since POSIXct objects are actually doubles (seconds since the origin) with a class attribute of 'POSIXct', they cannot be stored as POSIXct in the data part of an xts or zoo matrix. (They can be xts indexes, because there are different rules for that part of the structure.)
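The matrix behaviour described above can be checked in base R without xts. This is a minimal sketch with made-up variable names, not a reproduction of the asker's data:

```r
# A POSIXct value is a double carrying class and tzone attributes.
stamp <- as.POSIXct("2015-01-02 10:02:27", tz = "UTC")
typeof(stamp)          # "double"
unclass(stamp)         # seconds since 1970-01-01 UTC, plus a tzone attribute

# A matrix has a single storage type, so mixing character and numeric
# columns coerces everything to character.
mixed <- cbind(full_timestamp = "2015-01-02 10:02:27", PRICE = 85.4)
typeof(mixed)          # "character"

# Keeping only the numeric column preserves the numeric type.
prices <- as.matrix(data.frame(PRICE = c(85.4, 85.3)))
typeof(prices)         # "double"
```

This is why the timestamp belongs in `order.by` (the index) and only the numeric columns belong in the data part.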

 library(xts)  # also attaches zoo, which provides read.csv.zoo()
 myxts <- as.xts(read.csv.zoo(text="full_timestamp,             PRICE
 2015-01-02 10:02:27.389055,  85.4
 2015-01-02 10:03:30.926059,  85.3
 2015-01-02 10:04:52.231750,  85.4
 2015-01-02 10:05:37.139763,  85.5
 2015-01-02 10:06:54.926069,  85.5
 2015-01-02 10:07:57.253187,  85.3", sep=",", index=1, tz="GMT"))
 myxts
#-----------------
                    [,1]
2015-01-02 10:02:27 85.4
2015-01-02 10:03:30 85.3
2015-01-02 10:04:52 85.4
2015-01-02 10:05:37 85.5
2015-01-02 10:06:54 85.5
2015-01-02 10:07:57 85.3
Warning message:
timezone of object (GMT) is different than current timezone (). 

 dput(myxts)
#-------------
structure(c(85.4, 85.3, 85.4, 85.5, 85.5, 85.3), .Dim = c(6L, 
1L), index = structure(c(1420192947.38906, 1420193010.92606, 
1420193092.23175, 1420193137.13976, 1420193214.92607, 1420193277.25319
), tzone = "GMT", tclass = c("POSIXct", "POSIXt")), class = c("xts", 
"zoo"))

 index(myxts)
 #--------------
[1] "2015-01-02 10:02:27 GMT" "2015-01-02 10:03:30 GMT" "2015-01-02 10:04:52 GMT"
[4] "2015-01-02 10:05:37 GMT" "2015-01-02 10:06:54 GMT" "2015-01-02 10:07:57 GMT"
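
The printed index shows whole seconds only, while the dput() output shows that the fractional seconds are still stored. A base-R sketch of how to display them, using the first index value from the dput() output (the name `first_stamp` is just for illustration):

```r
# First index value from the dput() output, rebuilt as a POSIXct in GMT.
first_stamp <- as.POSIXct(1420192947.38906, origin = "1970-01-01", tz = "GMT")

# Default formatting drops the fractional seconds...
format(first_stamp)

# ...but they are still stored and can be shown on request.
format(first_stamp, "%Y-%m-%d %H:%M:%OS6")

# Alternatively, change the default for the whole session:
# options(digits.secs = 6)
```

The last displayed digit may differ slightly from the source text, because the value is held as a floating-point number of seconds.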
IRTFM
  • Setting the timezone to GMT didn't change anything except that the index now shows GMT instead of AEDT. Just out of curiosity, what does `str(myxts)` look like for you? I'm beginning to get really confused as to what is going on. Cheers. – jem888 Feb 19 '20 at 03:33
  • I gave you the dput output. That should give you everything you need. It should be better than str, but if you really need str, then just make a myxts using that value and run str. Your object was NOT an xts object. It was a "tibble". The index of my dput object does yield a POSIXct object with the TZ of GMT. I showed how I made it. Don't you get the same result with my full code? – IRTFM Feb 19 '20 at 04:39