
I have a matrix:

mat <- matrix(c(2,11,3,1,2,4,55,65,12,4,6,6,7,9,3,23,16,77,5,5,7),ncol = 3, byrow = TRUE)

     [,1] [,2] [,3]
[1,]    2   11    3
[2,]    1    2    4
[3,]   55   65   12
[4,]    4    6    6
[5,]    7    9    3
[6,]   23   16   77
[7,]    5    5    7

I want to add a column holding a row index. The index starts at 1 and stays the same until it reaches a row whose row sum is > 100, at which point it moves to the next value.

     Indx [,2] [,3] [,4]
[1,]    1    2   11    3
[2,]    1    1    2    4
[3,]    2   55   65   12
[4,]    3    4    6    6
[5,]    3    7    9    3
[6,]    4   23   16   77
[7,]    5    5    5    7
smci
  • Ah, you actually want to increment both on the row where rowSum > 100, **and** the following row. Otherwise, you would not increment on rows `[4,] 4 6 6` or `[7,] 5 5 7` – smci Aug 16 '18 at 20:42

4 Answers


Using rle:

matRle <- rle(rowSums(mat) > 100)$lengths

cbind(rep(seq(length(matRle)), matRle), mat)
#      [,1] [,2] [,3] [,4]
# [1,]    1    2   11    3
# [2,]    1    1    2    4
# [3,]    2   55   65   12
# [4,]    3    4    6    6
# [5,]    3    7    9    3
# [6,]    4   23   16   77
# [7,]    5    5    5    7
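
To see why this works, it may help to look at the runs that rle finds. The following is only an illustrative sketch on the question's matrix; the names r and idx are made up for the example.

```r
mat <- matrix(c(2,11,3,1,2,4,55,65,12,4,6,6,7,9,3,23,16,77,5,5,7),
              ncol = 3, byrow = TRUE)

# Run-length encode the logical vector "row sum exceeds 100"
r <- rle(rowSums(mat) > 100)
r$values
# [1] FALSE  TRUE FALSE  TRUE FALSE
r$lengths
# [1] 2 1 2 1 1

# Number the runs 1, 2, 3, ... and repeat each number for the length of its run
idx <- rep(seq_along(r$lengths), r$lengths)
idx
# [1] 1 1 2 3 3 4 5
```

Note that this numbers runs, so two consecutive rows with sums > 100 would fall into the same run and share one index. That does not happen in the example data.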
zx8754

A solution using dplyr.

library(dplyr)

mat2 <- mat %>%
  as.data.frame() %>%
  mutate(Indx = cumsum(rowSums(mat) > 100 | lag(rowSums(mat) > 100, default = TRUE))) %>%
  select(Indx, paste0("V", 1:ncol(mat))) %>%
  as.matrix()
mat2
#      Indx V1 V2 V3
# [1,]    1  2 11  3
# [2,]    1  1  2  4
# [3,]    2 55 65 12
# [4,]    3  4  6  6
# [5,]    3  7  9  3
# [6,]    4 23 16 77
# [7,]    5  5  5  7
www
cbind(cumsum(replace(a <- rowSums(mat) > 100, which(a == 1) + 1, 1)) + 1, mat)
     [,1] [,2] [,3] [,4]
[1,]    1    2   11    3
[2,]    1    1    2    4
[3,]    2   55   65   12
[4,]    3    4    6    6
[5,]    3    7    9    3
[6,]    4   23   16   77
[7,]    5    5    5    7

What does this do?

First, find the rows whose sum is greater than 100:

a<-rowSums(mat)>100

The row after each row with a sum > 100 should also get the next index, so flag those rows with replace and then take the cumsum:

cumsum(replace(a,which(a==1)+1,1))

This count starts from zero, so add 1.
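
The steps above can be traced one at a time. This is a sketch rather than the original one-liner: the names nxt, flags and idx are made up here, and the bounds check on nxt is an added assumption to cover the case where the last row has a sum > 100.

```r
mat <- matrix(c(2,11,3,1,2,4,55,65,12,4,6,6,7,9,3,23,16,77,5,5,7),
              ncol = 3, byrow = TRUE)

# Step 1: flag rows whose sum exceeds 100
a <- rowSums(mat) > 100
a
# [1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE

# Step 2: positions of the rows that follow a flagged row,
# dropping any position past the last row (assumed safeguard)
nxt <- which(a) + 1
nxt <- nxt[nxt <= length(a)]

# Step 3: flag those rows too, accumulate, and shift to start at 1
flags <- replace(a, nxt, TRUE)
idx <- cumsum(flags) + 1
idx
# [1] 1 1 2 3 3 4 5

cbind(Indx = idx, mat)
```

Splitting the assignment of a out of the replace call also avoids relying on the order in which R evaluates the arguments.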

Onyambu

We could do this with rleid from data.table

library(data.table)
cbind(Indx =  rleid(rowSums(mat) > 100), mat)
#     Indx         
#[1,]    1  2 11  3
#[2,]    1  1  2  4
#[3,]    2 55 65 12
#[4,]    3  4  6  6
#[5,]    3  7  9  3
#[6,]    4 23 16 77
#[7,]    5  5  5  7
akrun
  • Wasn't aware data.table had [`rleid`](https://www.rdocumentation.org/packages/data.table/versions/1.11.4/topics/rleid)! There's a link to its doc. – smci Aug 16 '18 at 21:04
  • Also: [Is there a dplyr equivalent to data.table::rleid?](https://stackoverflow.com/questions/33507868/is-there-a-dplyr-equivalent-to-data-tablerleid) – smci Aug 16 '18 at 21:05
  • @smci That looks like a possible dupe? – akrun Aug 16 '18 at 21:11
  • Related but not an exact dupe. Per my comment above, OP actually wants to increment both on the row where rowSum > 100, **and** the following row. Also, this question allows `base, dplyr, purrr, data.table` et al. So it's valuable. – smci Aug 16 '18 at 21:13