I have a matrix with many rows. Let's say

M = matrix(1:20, nrow = 4, ncol = 5)

And I have a threshold variable, e.g.

threshold = c(4,7,11,14,17)

Now I want R to compare each row of the matrix with this threshold, value by value, and tell me whether at least one value in that row exceeds the corresponding threshold value. That is, M[1,1] should be compared with threshold[1], M[1,2] with threshold[2], etc.

Ideally I would like to have a new variable, let's call it check, with just 1/FALSE (there is at least 1 value in the row exceeding the threshold) or 0/TRUE (no such values). So far, this is what I have come up with:

check = apply (M, MARGIN=1, (ifelse((M[,] < threshold), 1, 0)))

check = apply (check, MARGIN=1, sum)

check = check == 0

But there are 3 problems with it:

  1. Maybe it's not the best way to solve the problem? I have a lot of data, and I suspect this approach will be very slow.
  2. It doesn't work, R says:

    check = apply (M, MARGIN=1, (ifelse((M[,] < threshold), 1, 0)))
    Error in match.fun(FUN) : '(ifelse((M[, ] < threshold), 1, 0))' is not a function, letter or symbol

  3. Even if I perform just

    ifelse((M < threshold), 1, 0)

for the first row I get

[1,]    1    1    1    0    0

Which is not true, because there are no values in the first row that exceed the threshold. It seems that R just compares the whole first row with the 1st element of threshold, then the whole 2nd row with the second value etc., and that's not what I want...

Many thanks in advance!

Sotos
Alex M
  • I think, for (3), you are getting the result because the first row of M looks like `1 2 3 4 5` and neither 4 nor 5 are < the lowest value in threshold `4` – jessi Oct 23 '19 at 12:19
  • But matrix is created column by column, so the first row is (1,5,9,13,17), not (1,2,3,4,5) ... [,1] [,2] [,3] [,4] [,5] [1,] 1 5 9 13 17 [2,] 2 6 10 14 18 [3,] 3 7 11 15 19 [4,] 4 8 12 16 20 – Alex M Oct 23 '19 at 13:09
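The result in (3) can be traced to R's recycling rule. Because a matrix is stored column by column, `M < threshold` walks down the columns of `M` and reuses `threshold` every five elements. It does not pair column j with `threshold[j]`. A minimal sketch, assuming `M` and `threshold` as defined in the question (`recycled` is a hypothetical helper name used only for illustration):

```r
M <- matrix(1:20, nrow = 4, ncol = 5)
threshold <- c(4, 7, 11, 14, 17)

# R stores M column by column, so M < threshold pairs the k-th stored
# element of M with threshold[(k - 1) %% 5 + 1].
recycled <- matrix(rep(threshold, length.out = length(M)), nrow = nrow(M))
recycled[1, ]
# [1]  4 17 14 11  7

# The first row of M is 1 5 9 13 17, compared against 4 17 14 11 7:
ifelse(M < threshold, 1, 0)[1, ]
# [1] 1 1 1 0 0
```

So the first row is effectively compared with the threshold values in a scrambled order, which is why it does not match the intended element-by-element comparison.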

2 Answers

1

You can try,

rowSums(t(M) > threshold) >= 1
#[1] FALSE  TRUE  TRUE  TRUE  TRUE

To see it row-by-row just do,

t((t(M) > threshold)*1) #---> ...* 1 just converts from logical to integer

#     [,1] [,2] [,3] [,4] [,5]
#[1,]    0    0    0    0    0
#[2,]    0    0    0    0    1
#[3,]    0    0    0    1    1
#[4,]    0    1    1    1    1

Based on your comment,

as.integer(rowSums(t((t(M) > threshold) * 1) > 0) > 0)
#[1] 0 1 1 1
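
A possibly more direct route to one value per row, sketched here on the assumption that `M` and `threshold` are defined as in the question: after transposing, each column of `t(M)` is one row of `M`, so `threshold` is recycled along the original rows and `colSums` counts the exceedances per row. The name `check` comes from the question; `exceeds` is a hypothetical intermediate name.

```r
M <- matrix(1:20, nrow = 4, ncol = 5)
threshold <- c(4, 7, 11, 14, 17)

# t(M) is 5 x 4: each column is one row of M, and threshold (length 5)
# is recycled down every column, so the comparison is element by element.
exceeds <- t(M) > threshold

# One logical per row of M: TRUE if any value exceeds its threshold.
check <- colSums(exceeds) > 0
check
# [1] FALSE  TRUE  TRUE  TRUE

# The same result as 0/1 integers.
as.integer(check)
# [1] 0 1 1 1
```

This stays fully vectorised, which tends to matter when the matrix has many rows.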
Sotos
  • Sorry, but I want it to be compared row by row, i.e., I should get 4 values at the end (one for each row). In your example, I get 5 values, because it compares column by column – Alex M Oct 23 '19 at 13:45
  • I don't understand what you mean. You also have 5 values for the first row... `1 1 1 0 0` – Sotos Oct 23 '19 at 13:48
  • @AlexM please have a look now. The calculations were correct. There was just another step to get the logical (or binary) as to which row has any value greater than its threshold. – Sotos Oct 23 '19 at 14:08
1
apply(M, 1, function(x) max(diag(sapply(x, function(y) y >threshold))))
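
The `sapply`/`diag` construction builds a 5 x 5 comparison matrix for every row and keeps only its diagonal. A shorter sketch of the same row-wise idea, assuming `M` and `threshold` as defined in the question, compares each row with `threshold` directly and reduces with `any` (`via_diag` is a hypothetical name used only for the cross-check):

```r
M <- matrix(1:20, nrow = 4, ncol = 5)
threshold <- c(4, 7, 11, 14, 17)

# apply() passes each row of M as a vector x of length 5, so
# x > threshold is a plain element-wise comparison.
check <- apply(M, 1, function(x) any(x > threshold))
check
# [1] FALSE  TRUE  TRUE  TRUE

# The diag() version carries the same information as 0/1 values.
via_diag <- apply(M, 1, function(x) max(diag(sapply(x, function(y) y > threshold))))
stopifnot(all(via_diag == as.integer(check)))
```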
pk_22