I have a data frame that looks like this:

name       strand
thrL       1
thrA       1
thrB       1
yaaA       -1
yaaJ       -1
talB       1
mog        1

I would like to put the first run of positive values into one group, the following negative values into a second group, and the next run of positive values into a third group, so that the result looks like this:

name       strand     direction
thrL       1           1
thrA       1           1
thrB       1           1
yaaA       -1          2
yaaJ       -1          2
talB       1           3
mog        1           3

I am thinking of using dplyr, but I need some help with the R code. Thank you so much.

Limey
  • See [Create group number for contiguous runs of equal values](https://stackoverflow.com/questions/30314679/create-group-number-for-contiguous-runs-of-equal-values) – Henrik Mar 18 '21 at 11:51

2 Answers

Using rle:

df$direction <- with(rle(sign(df$strand)), rep(seq_along(values), lengths))
df

#  name strand direction
#1 thrL      1         1
#2 thrA      1         1
#3 thrB      1         1
#4 yaaA     -1         2
#5 yaaJ     -1         2
#6 talB      1         3
#7  mog      1         3
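
To see why this works, it may help to look at the intermediate rle object. The following is a minimal sketch that assumes a standalone strand vector copied from the question (the variable names strand, runs and direction are only illustrative):

```r
# Sample strand column taken from the question.
strand <- c(1, 1, 1, -1, -1, 1, 1)

# rle() compresses the vector into runs of equal consecutive values.
runs <- rle(sign(strand))
runs$lengths  # 3 2 2
runs$values   # 1 -1 1

# Number the runs 1, 2, 3, ... and repeat each number by its run length.
direction <- rep(seq_along(runs$values), runs$lengths)
direction     # 1 1 1 2 2 3 3
```

Each run gets the next integer, so a sign that reappears later (the second block of 1s) still starts a new group.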

This can be made shorter with data.table rleid.

df$direction <- data.table::rleid(sign(df$strand))
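
Since the question mentions dplyr, here is a possible dplyr sketch of the same idea. It assumes the dplyr package is installed and rebuilds the example data under the hypothetical name df; it is one way to do it, not necessarily the shortest:

```r
library(dplyr)

# Example data rebuilt from the question (the name df is an assumption).
df <- data.frame(
  name   = c("thrL", "thrA", "thrB", "yaaA", "yaaJ", "talB", "mog"),
  strand = c(1, 1, 1, -1, -1, 1, 1)
)

result <- df %>%
  mutate(
    # TRUE wherever strand differs from the previous row; the first row is
    # compared with itself, so it never counts as a change.
    changed   = strand != dplyr::lag(strand, default = first(strand)),
    # The running count of changes, plus 1, gives the group number.
    direction = cumsum(changed) + 1L
  ) %>%
  select(-changed)

result$direction  # 1 1 1 2 2 3 3
```

If I recall correctly, dplyr 1.1.0 and later also provide consecutive_id(), which should give the same numbering via mutate(direction = consecutive_id(strand)).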
Ronak Shah

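We can also do this with inverse.rle, replacing the run values with their sequence numbers before expanding the runs again: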

df1$direction <- inverse.rle(within.list(rle(sign(df1$strand)),
         values <- seq_along(values)))
df1$direction
#[1] 1 1 1 2 2 3 3

data

df1 <- structure(list(name = c("thrL", "thrA", "thrB", "yaaA", "yaaJ", 
"talB", "mog"), strand = c(1L, 1L, 1L, -1L, -1L, 1L, 1L)), 
class = "data.frame", row.names = c(NA, 
-7L))
akrun