You appear to have two separate problems here.
Problem 1: Given a matrix index, for each row i and column j you want to set test[i,j] to 2 if j appears in row i of index. This can be done with simple matrix indexing: pass a 2-column matrix of indices in which the first column holds the row of each element you want to set and the second column holds its column:
test[cbind(as.vector(row(index)), as.vector(index))] <- 2
test
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]    2    2    0    2    2    2    2    0    2     2
#  [2,]    2    0    2    2    2    2    2    0    2     2
#  [3,]    2    2    2    2    0    0    2    2    0     0
#  [4,]    2    2    0    0    0    2    2    2    0     2
#  [5,]    2    2    2    2    0    0    0    0    2     0
#  [6,]    0    0    0    0    0    2    2    2    2     0
#  [7,]    2    0    2    2    2    2    2    0    0     0
#  [8,]    2    0    2    2    2    2    0    2    0     2
#  [9,]    2    2    2    2    0    0    2    0    2     2
# [10,]    2    0    2    0    0    2    2    2    2     0
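If the 2-column index matrix is hard to picture, here is a minimal sketch on made-up data. The names test.small, index.small and pairs are hypothetical and not from your question; the idea is only to show what cbind(as.vector(row(index)), as.vector(index)) builds before it is used for assignment.

```r
# Hypothetical toy data (not from the question): 3 rows, 4 columns.
test.small <- matrix(0, nrow = 3, ncol = 4)

# Row i of index.small lists the columns to set in row i of test.small.
index.small <- matrix(c(1, 4,
                        2, 2,
                        3, 1), ncol = 2, byrow = TRUE)

# row(index.small) gives the row number of every cell, so pairing it with
# the cell values yields one (row, column) pair per entry of index.small.
pairs <- cbind(as.vector(row(index.small)), as.vector(index.small))
pairs

# Indexing with a 2-column matrix addresses exactly those cells.
# Duplicate pairs, such as (2, 2) here, are harmless for a constant assignment.
test.small[pairs] <- 2
test.small
```

Each row of pairs names one cell, so the assignment touches only the listed cells and leaves the rest at 0.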
Because the matrix-indexing assignment handles every element in a single vectorized operation, it should be faster than looping through the rows and handling them one at a time. Here's a benchmark with 1 million rows and 10 columns:
OP <- function(test, index) {
  for (i in 1:nrow(test)){
    test[i,index[i,]] <- 2
  }
  test
}
josliber <- function(test, index) {
  test[cbind(as.vector(row(index)), as.vector(index))] <- 2
  test
}
test.big <- matrix(0, nrow = 1000000, ncol = 10)
set.seed(1234)
index.big <- matrix(sample.int(10, 1000000*10, TRUE), 1000000, 10)
identical(OP(test.big, index.big), josliber(test.big, index.big))
# [1] TRUE
system.time(OP(test.big, index.big))
#    user  system elapsed
#   1.564   0.014   1.591
system.time(josliber(test.big, index.big))
#    user  system elapsed
#   0.408   0.034   0.444
Here, the vectorized approach is roughly 3.5x faster in elapsed time (1.591 s versus 0.444 s).
Problem 2: You want to set row i of test to the result of order() applied to the corresponding row of anyMatrix. You can do this with apply():
(test <- t(apply(anyMatrix, 1, order)))
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]    1   10    7    8    4    5    3    6    2     9
#  [2,]    8    7    1    6    3    4    9    5   10     2
#  [3,]    4    9    7    1    3    2    6   10    5     8
#  [4,]    1    2    6    4   10    3    9    8    7     5
#  [5,]    9    6    5    1    2    7   10    4    8     3
#  [6,]    9    3    8    6    5   10    1    4    7     2
#  [7,]    3    7    2    5    6    8    9    4    1    10
#  [8,]    9    8    1    3    4    6    7   10    5     2
#  [9,]    8    4    3    6   10    7    9    5    2     1
# [10,]    4    1    9    3    6    7    8    2   10     5
I wouldn't expect much of a change in runtime here, because apply is really just looping through the rows, much as you were looping in your solution. Still, I would prefer this solution because it's a good deal less typing and is the more idiomatic R way of doing things.
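One detail worth spelling out is the t() wrapped around the apply call. A small sketch on made-up data shows why it is needed (m, raw and res are hypothetical names, not from your question): when the function returns a vector for each row, apply stores each result as a column, so the output has to be transposed to line up with the rows of the input.

```r
# Hypothetical toy matrix (not from the question): 2 rows, 3 columns.
m <- matrix(c(3, 1, 2,
              9, 8, 7), nrow = 2, byrow = TRUE)

# apply() over rows returns one COLUMN per input row, giving a 3 x 2 matrix.
raw <- apply(m, 1, order)
dim(raw)

# Transposing restores the original orientation: row i of res is order(m[i, ]).
res <- t(raw)
res
```

Without the transpose, the result would have the dimensions of the input swapped, which is easy to miss when the input happens to be square, as it is in your 10 x 10 example.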
Note that the two problems called for quite different code, which is typical of R data manipulation: there are many specialized operators, and you need to pick the one that fits your task. I don't think there's a single function, or even a small set of functions, that can handle every matrix manipulation that depends on data from another matrix.