
I am following a previous answer to create my first xts time series. The output contains duplicated rows, but only in some cases. The data are at 5-minute increments, but there are gaps that do not fall on the regular 5-minute schedule, so I am using xts for these irregular data in order to use acf later. This example with the first 10 rows works:

> waterlevels
                 dates Water.Level.ft
1  2014-12-18 15:43:16             NA
2  2014-12-18 15:48:16          2.608
3  2014-12-18 15:53:16          2.610
4  2014-12-18 15:58:16          2.605
5  2014-12-18 16:03:16          2.600
6  2014-12-18 16:08:16          2.553
7  2014-12-18 16:13:16          2.565
8  2014-12-18 16:18:16          2.352
9  2014-12-18 16:23:16          2.350
10 2014-12-18 16:28:16          2.357

library(xts)

dtw2 <- data.frame(waterlevels$dates, waterlevels$Water.Level.ft)
colnames(dtw2) <- c("dates","waterlevels")
dtw2.ts <- xts(dtw2$waterlevels, order.by = dtw2$dates)

But with the full dataset (89,246 rows, so I am not sure how to post it), the output contains duplicated rows (the data are in EST):

dtw <- data.frame(waterlevels.cw2$dates, waterlevels.cw2$Water.Level.ft)
colnames(dtw) <- c("dates","waterlevels")
dtw.ts <- xts(dtw$waterlevels, order.by=dtw$dates)

> head(dtw.ts)
                 [,1]
2014-12-18 15:43:16    NA  
2014-12-18 15:43:16    NA
2014-12-18 15:48:16 2.608
2014-12-18 15:48:16 2.608
2014-12-18 15:53:16 2.610
2014-12-18 15:53:16 2.610
Warning message:
timezone of object (EST) is different than current timezone (). 

Why would each row appear twice in the resulting time series?

Aditi
  • It's going to be hard to help you debug this without the actual data file. Can you upload it to dropbox, pastebin, etc? – Joshua Ulrich Sep 24 '15 at 14:49
  • Here is waterlevels.cw2: https://dl.dropboxusercontent.com/u/23038935/waterlevels.cw2.RData This is used in the second snippet of code which produces the duplicates – Aditi Sep 24 '15 at 15:54

1 Answer


Quite simply, the xts object has duplicate rows because your waterlevels.cw2 data.frame does. Rows 1 to 21,666 are identical to rows 21,667 to 43,332.

> wl <- waterlevels.cw2
> all.equal(wl[1:21666,], wl[21667:43332,], check.attributes=FALSE)
[1] TRUE
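If you do not already know where the repeated block starts, base R can locate duplicates for you. The following is a minimal sketch on a hypothetical toy data.frame (`toy` and its values are made up for illustration, not taken from waterlevels.cw2), using `anyDuplicated()` and `duplicated()`:

```r
# Hypothetical toy data: three readings stacked on top of themselves,
# mimicking the kind of repetition found in the question's data.frame.
dates <- as.POSIXct(c("2014-12-18 15:43:16",
                      "2014-12-18 15:48:16",
                      "2014-12-18 15:53:16"), tz = "EST")
toy <- data.frame(dates = rep(dates, 2),
                  Water.Level.ft = rep(c(NA, 2.608, 2.610), 2))

# anyDuplicated() returns the index of the first repeated row (0 if none).
first_dup <- anyDuplicated(toy)

# duplicated() flags every row that repeats an earlier row.
n_dup <- sum(duplicated(toy))

# Count how often each timestamp occurs; anything above 1 is a repeat.
stamp_counts <- table(format(toy$dates))
```

Here `first_dup` points at the first row of the repeated block, which is the kind of boundary used in the `all.equal()` check above.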

Remove the duplicates from your data.frame, and they won't be in your xts object:

> dtw <- data.frame(dates=waterlevels.cw2$dates,
+                   waterlevels=waterlevels.cw2$Water.Level.ft)
> head(dtw.ts <- with(dtw[-(1:21666),], xts(waterlevels, dates)))
                     [,1]
2014-12-18 15:43:16    NA
2014-12-18 15:48:16 2.608
2014-12-18 15:53:16 2.610
2014-12-18 15:58:16 2.605
2014-12-18 16:03:16 2.600
2014-12-18 16:08:16 2.553
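Dropping rows by position works here because the repeated block is known. As a more general sketch, assuming the repeats are exact copies of earlier rows, you could filter with `duplicated()` before building the xts object, or let xts drop repeated index values with `make.index.unique(drop = TRUE)`. The `toy` data below is hypothetical and only stands in for the real data.frame:

```r
library(xts)

# Hypothetical toy data with every reading present twice.
dates <- as.POSIXct(c("2014-12-18 15:43:16",
                      "2014-12-18 15:48:16",
                      "2014-12-18 15:53:16"), tz = "EST")
toy <- data.frame(dates = rep(dates, 2),
                  waterlevels = rep(c(NA, 2.608, 2.610), 2))

# Option 1: keep only the first occurrence of each (dates, waterlevels) row,
# then build the time series from the de-duplicated data.frame.
toy_unique <- toy[!duplicated(toy), ]
toy_ts <- xts(toy_unique$waterlevels, order.by = toy_unique$dates)

# Option 2: build the series first, then drop observations whose
# timestamp repeats an earlier one.
toy_ts2 <- make.index.unique(xts(toy$waterlevels, order.by = toy$dates),
                             drop = TRUE)
```

Note that option 2 compares timestamps only, so it would also discard rows that share a timestamp but carry different values; option 1 keeps those.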
Joshua Ulrich