
I'm trying to create a bagging algorithm. For this I need to draw random blocks from the whole time series. I created an index vector that contains the random block draws, but when I apply it to my zoo time series, I get the following error:

In zoo(rval, index(x)[i]) :
  some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

Because the sampling uses replacement, I get multiple entries with the same timestamp. I tried converting the data into a matrix that doesn't contain timestamps, but that didn't really work out. I also tried the same thing manually in the console, where I didn't get the warning from the zoo object, but I don't want to work around this by suppressing the error message or anything like that.

This is the part of the code in question, where n = sample size, m = block size, and b = number of blocks (so that m*b = n). Ypretest and Xpretest are initialized as zoo objects (I also tried various other types, but that didn't work out either).

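if (n %% m == 0) {
  b <- n / m

  # assumption: block start positions begin at 1 (the initialization was not shown)
  blockvector <- 1
  while (tail(blockvector, n = 1) + m < n) {
    blockvector <- c(blockvector, tail(blockvector, n = 1) + m)
  }

  # draw b block start positions with replacement
  randomvector <- sample(blockvector, b, replace = TRUE)

  # assumption: blockindex starts out empty (the initialization was not shown)
  blockindex <- c()
  for (i in 1:b) {
    blockindex <- c(blockindex, randomvector[i]:(randomvector[i] + m - 1))
  }

  Ypretest <- Y[blockindex]
  Xpretest <- X[blockindex]
}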

Any suggestions?

John Paul
  • What kind of information does your machine learning algorithm need, vectors or zoo objects? – Paul Hiemstra Feb 12 '13 at 13:34
  • 1) Is that a warning or an error? 2) Your statement that it worked differently from the console makes no sense. What really is going on? 3) What does "..matrix.. didn't work out really" mean? – Carl Witthoft Feb 12 '13 at 13:52
  • @PaulHiemstra: the machine learning algorithm basically only needs vectors/matrices, but I thought it would be handy if it could handle ts or zoo objects – MichaelJeremias Feb 13 '13 at 08:15
  • @CarlWitthoft: sorry for being unclear: (1) it's a warning, and (2) you're right, it also occurs when I do the exact same things in the console as well. (3) About the matrix: I'm not really familiar with R yet, but I thought if I converted the zoo object into a more basic object to strip it of its timestamp, I could get around this warning – MichaelJeremias Feb 13 '13 at 08:19

1 Answer


This kind of resampling with replacement is not entirely compatible with the time series objects in R, where observations with the same timestamp can cause issues. Given that your machine learning algorithm works with vectors and matrices, I would just use those. You can drop the zoo objects altogether, or convert them just before bagging.
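
A minimal sketch of the "convert just before bagging" option, assuming the zoo package is installed. The toy data, the seed, and the names y_plain, x_plain, starts, and drawn are made up for illustration; n, m, b, Y, X, blockindex, Ypretest, and Xpretest follow the question. coredata() returns the underlying vector or matrix without the time index, so indexing it with repeated positions does not involve zoo's uniqueness check.

```r
library(zoo)

set.seed(1)                       # illustrative seed, only for reproducibility
n <- 12                           # sample size
m <- 3                            # block size
b <- n / m                        # number of blocks (assumes n %% m == 0)

# Toy data standing in for the real series (hypothetical values)
Y <- zoo(rnorm(n), as.Date("2013-01-01") + 0:(n - 1))
X <- zoo(matrix(rnorm(2 * n), ncol = 2), index(Y))

# Strip the time index just before bagging
y_plain <- coredata(Y)            # plain numeric vector
x_plain <- coredata(X)            # plain numeric matrix

# Block start positions 1, 1 + m, 1 + 2m, ...; draw b of them with replacement
starts <- seq(1, n - m + 1, by = m)
drawn <- sample(starts, b, replace = TRUE)

# Expand each drawn start into the m consecutive positions of its block
blockindex <- unlist(lapply(drawn, function(s) s:(s + m - 1)))

# Plain objects accept repeated positions without any zoo warning
Ypretest <- y_plain[blockindex]
Xpretest <- x_plain[blockindex, , drop = FALSE]
```

Because Ypretest and Xpretest are now an ordinary vector and matrix, the timestamps are gone; if they are needed later, one option is to keep index(Y)[blockindex] in a separate vector.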

Paul Hiemstra