I have a data frame of time series:

  X1.HK.Equity      X X2.HK.Equity    X.2 X3.HK.Equity   X.4
1   31/12/2002 38.855   31/12/2002 19.547   31/12/2002 5.011
2   02/01/2003 38.664   02/01/2003 19.547   02/01/2003 4.986
3   03/01/2003 40.386   03/01/2003 19.547   03/01/2003 4.962
4   06/01/2003 40.386   06/01/2003 19.609   06/01/2003 4.937
5   07/01/2003 40.195   07/01/2003 19.609   07/01/2003 4.937
6   08/01/2003 40.386   08/01/2003 19.547   08/01/2003 4.912

I want to convert this data frame into a list of 3 items, each an xts object created from one pair of columns: 1-2, 3-4 and 5-6. Note that the time series do not necessarily have the same dates.

I would be extra happy if someone could show me how to do this using the plyr package.

dput of my data frame:

structure(list(X1.HK.Equity = c("31/12/2002", "02/01/2003", "03/01/2003", 
"06/01/2003", "07/01/2003", "08/01/2003"), X = c(38.855, 38.664, 
40.386, 40.386, 40.195, 40.386), X2.HK.Equity = c("31/12/2002", 
"02/01/2003", "03/01/2003", "06/01/2003", "07/01/2003", "08/01/2003"
), X.2 = c(19.547, 19.547, 19.547, 19.609, 19.609, 19.547), X3.HK.Equity = c("31/12/2002", 
"02/01/2003", "03/01/2003", "06/01/2003", "07/01/2003", "08/01/2003"
), X.4 = c(5.011, 4.986, 4.962, 4.937, 4.937, 4.912)), .Names = c("X1.HK.Equity", 
"X", "X2.HK.Equity", "X.2", "X3.HK.Equity", "X.4"), row.names = c(NA, 
6L), class = "data.frame")
mchangun

3 Answers

An ordinary lapply would work here:

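library(xts)  # also attaches zoo, which provides read.zoo
# DF is the question's data frame; columns i and i+1 hold one date/value pair
to.xts <- function(i) as.xts(read.zoo(DF[i+0:1], format = "%d/%m/%Y"))
lapply(seq(1, ncol(DF), 2), to.xts)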

If this is not just an example and there really are only three series, it would be sufficient to replace the last line with:

list(to.xts(1), to.xts(3), to.xts(5))
G. Grothendieck

I don't know if this is the most efficient way to do this, but hopefully, it is fairly logical.

library(xts)
lapply(seq_along(mydf[c(FALSE, TRUE)]), function(x) {
  xts(mydf[c(FALSE, TRUE)][x], 
      order.by=as.Date(mydf[c(TRUE, FALSE)][, x], format = "%d/%m/%Y"))
})
# [[1]]
#                 X
# 2002-12-31 38.855
# 2003-01-02 38.664
# 2003-01-03 40.386
# 2003-01-06 40.386
# 2003-01-07 40.195
# 2003-01-08 40.386
# 
# [[2]]
#               X.2
# 2002-12-31 19.547
# 2003-01-02 19.547
# 2003-01-03 19.547
# 2003-01-06 19.609
# 2003-01-07 19.609
# 2003-01-08 19.547
# 
# [[3]]
#              X.4
# 2002-12-31 5.011
# 2003-01-02 4.986
# 2003-01-03 4.962
# 2003-01-06 4.937
# 2003-01-07 4.937
# 2003-01-08 4.912

Basically, it uses recycling of TRUE and FALSE to select alternate columns. The seq_along part tells us how many pairs we have (in this example, 3), and in the anonymous function, we subset the values with c(FALSE, TRUE), and the dates with c(TRUE, FALSE).
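The recycling described above can be seen on its own, without xts. This is only a small sketch: the two-pair `mydf` below is a cut-down stand-in for the question's data, and `date_cols` and `value_cols` are names made up for illustration.

```r
# Cut-down stand-in for the question's data frame (two date/value pairs).
mydf <- data.frame(
  X1.HK.Equity = c("31/12/2002", "02/01/2003"),
  X            = c(38.855, 38.664),
  X2.HK.Equity = c("31/12/2002", "02/01/2003"),
  X.2          = c(19.547, 19.547),
  stringsAsFactors = FALSE
)

# A length-2 logical index is recycled across all four columns.
date_cols  <- names(mydf[c(TRUE, FALSE)])  # odd-numbered columns: 1 and 3
value_cols <- names(mydf[c(FALSE, TRUE)])  # even-numbered columns: 2 and 4

print(date_cols)
print(value_cols)
```

Because the index is recycled, the same two expressions should keep working however many date/value pairs the data frame holds, as long as the columns strictly alternate.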

A5C1D2H2I1M1N2O1R2T1
  • If your data are large, you might do better creating subsets of `mydf` outside of your `lapply` function, since (I think) this is subsetting within each "loop" that `lapply` makes. – A5C1D2H2I1M1N2O1R2T1 Apr 07 '13 at 08:39
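One way to do the subsetting outside the loop, as the comment suggests, is sketched below. It assumes the columns strictly alternate between dates and values; `dates`, `values` and `series` are illustrative names, and the small `mydf` is a stand-in for the real data. Whether this is noticeably faster on large data has not been measured here.

```r
library(xts)

# Cut-down stand-in for the question's data frame (two date/value pairs).
mydf <- data.frame(
  X1.HK.Equity = c("31/12/2002", "02/01/2003"),
  X            = c(38.855, 38.664),
  X2.HK.Equity = c("31/12/2002", "02/01/2003"),
  X.2          = c(19.547, 19.547),
  stringsAsFactors = FALSE
)

# Split the columns once, before iterating.
dates  <- mydf[c(TRUE, FALSE)]
values <- mydf[c(FALSE, TRUE)]

# Map walks the date columns and value columns in parallel.
series <- Map(
  function(d, v) xts(v, order.by = as.Date(d, format = "%d/%m/%Y")),
  dates, values
)

print(names(series))
```

Since `Map` takes its names from the first list it is given, the resulting list is named after the date columns (`X1.HK.Equity`, `X2.HK.Equity`), which happen to carry the ticker labels.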

With plyr, using llply

library(plyr); library(xts)
tss = llply(c(1, 3, 5), function(s) {
  pair = mydf[, s:(s+1)]  # date column and its value column
  xts(pair[, 2], order.by = as.Date(pair[, 1], format = "%d/%m/%Y"))
})

gives you

> tss[[1]]
             [,1]
2002-12-31 38.855
2003-01-02 38.664
2003-01-03 40.386
2003-01-06 40.386
2003-01-07 40.195
2003-01-08 40.386
> class(tss[[1]])
[1] "xts" "zoo"
> 

More generally, that c(1,3,5) vector can be written as seq(1, ncol(mydf), by=2).
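A sketch of the generalized version, under the assumption that every odd-numbered column holds dates and the column after it holds the matching values. `starts` is an illustrative name, the small `mydf` is a stand-in for the real data, and naming the list after the date columns is an optional extra that the answer above does not do.

```r
library(plyr)
library(xts)

# Cut-down stand-in for the question's data frame (two date/value pairs).
mydf <- data.frame(
  X1.HK.Equity = c("31/12/2002", "02/01/2003"),
  X            = c(38.855, 38.664),
  X2.HK.Equity = c("31/12/2002", "02/01/2003"),
  X.2          = c(19.547, 19.547),
  stringsAsFactors = FALSE
)

# Positions of the date columns: 1, 3, ...
starts <- seq(1, ncol(mydf), by = 2)

tss <- llply(starts, function(s) {
  xts(mydf[[s + 1]], order.by = as.Date(mydf[[s]], format = "%d/%m/%Y"))
})

# Label each series with the name of its date column.
names(tss) <- names(mydf)[starts]

print(names(tss))
```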

Spacedman