I want to do a lot of matrix indexing into a high-dimensional array, but the indices for each lookup are split across several vectors. I came up with a few solutions:

### setup
test <- array(0, c(3,3,3,3))
test[1,2,3,2] <- 1
system.time(for (i in 1:1000000) test[1,2,3,2] )
### index split between two vectors
idx1 <- c(1,2);     idx2 <- c(3,2)
### things that work are slower
system.time(for (i in 1:1000000) test[rbind(c(idx1, idx2))] )
system.time(for (i in 1:1000000) test[matrix(c(idx1, idx2), nrow=1)] )
system.time(for (i in 1:1000000) test[t(c(idx1, idx2))] )

But even the fastest of these, rbind(c(X)), takes twice as long as indexing directly. Is there a faster way? Is there anything like Python's *args that I could use with '['?

enfascination
  • How are your index vectors split up? Can you provide a sample with more than one index? This operation should be entirely vectorised. The slowness you are seeing comes from using a `for` loop for something that should not need one. – Simon O'Hanlon Nov 06 '13 at 11:31
  • I hope you are right that I don't need a for loop, but I'm not convinced. Can you vectorize this?
    ### populating distribution testa with observations testi
    testi <- matrix(sample(c(1,2,3), 400000, replace=TRUE), ncol=4)
    testa1 <- array(0, c(3,3,3,3))
    testa2 <- array(0, c(3,3,3,3))
    for (i in 1:nrow(testi)) { testa1[rbind(testi[i,])] <- testa1[rbind(testi[i,])] + 1 } ### for loop solution
    testa2[testi] <- testa2[testi] + 1 ### vectorized solution? It's faster, but way broken. What would you do?
    all(testa1 == testa2)
    sum(testa1); sum(testa2)
    – enfascination Nov 06 '13 at 12:19
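
The vectorized assignment in the comment above undercounts because a repeated row in the index matrix is only written once. One possible way around this, sketched here under assumptions (a smaller sample, a fixed seed, and the helper names `dims`, `strides` and `lin` are all illustrative and not from the thread), is to turn each row into a column-major linear index and count occurrences with `tabulate()`:

```r
set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
dims <- c(3, 3, 3, 3)
testi <- matrix(sample(1:3, 4000, replace = TRUE), ncol = 4)

# Loop version from the comment, kept as the reference result.
testa1 <- array(0, dims)
for (i in seq_len(nrow(testi))) {
  testa1[rbind(testi[i, ])] <- testa1[rbind(testi[i, ])] + 1
}

# Column-major linear index of each row: 1 + sum((i_k - 1) * stride_k).
strides <- cumprod(c(1, dims[-length(dims)]))
lin <- as.vector((testi - 1) %*% strides) + 1

# tabulate() counts every occurrence, so repeated cells are not lost.
testa2 <- array(tabulate(lin, nbins = prod(dims)), dims)

all(testa1 == testa2)
```

Whether this is fast enough for the real data would need to be timed; the sketch only shows that the counts can agree with the loop.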

1 Answer

A bit cumbersome, but try indexing with the individual elements:

test[idx1[1], idx1[2], idx2[1], idx2[2]]
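
On the question's `*args` point: a rough R analogue is `do.call()`, which calls a function with its arguments taken from a list. The following is only a sketch using the question's `test`, `idx1` and `idx2`; the names `args`, `via_do_call` and `explicit` are illustrative, and no speed claim is made, since `do.call()` adds its own overhead.

```r
test <- array(0, c(3, 3, 3, 3))
test[1, 2, 3, 2] <- 1
idx1 <- c(1, 2); idx2 <- c(3, 2)

# Build one argument list: the array first, then one index per dimension.
args <- c(list(test), as.list(c(idx1, idx2)))

# Equivalent to test[1, 2, 3, 2], with the indices spliced in from the list.
via_do_call <- do.call(`[`, args)

# The explicit element-by-element form, for comparison.
explicit <- test[idx1[1], idx1[2], idx2[1], idx2[2]]
identical(via_do_call, explicit)
```

It is worth timing both forms with `system.time()` as in the question before settling on one.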
Hong Ooi