I have a data frame with thousands of rows. I want to extract the start and end positions from column 2 (C2) for each run of consecutive rows where the value in column 4 (C4) equals 0.

C1  C2  C3  C4
R1  1   val 182
R1  2   val 22
R1  3   val 45
R1  4   val 0
R1  5   val 0
R1  6   val 0
R1  7   val 0
R1  8   val 108
R1  9   val 99
R1  10  val 0
R1  11  val 0

I want to find the ranges of C2 over which the values in C4 equal 0, for example 4-7 and 10-11. How do I find and print these ranges?

Callie

1 Answer

We can create a grouping variable with rleid and, if all the values of 'C4' in a group are 0, get the range of 'C2' for that group:

library(data.table)
setDT(df1)[, if(all(C4==0)) range(C2), rleid(C4 == 0)]$V1
#[1]  4  7 10 11

If we need each range as a string:

setDT(df1)[, if(all(C4==0)) paste(range(C2), collapse=":"), rleid(C4 == 0)]$V1
#[1] "4:7"   "10:11"
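To see why grouping by rleid works, it can help to look at the run ids directly. The following is only a sketch: df1 is rebuilt here from the table in the question (the column types are an assumption), and the column names start, end and run are made up for illustration.

```r
library(data.table)

# Sample data reconstructed from the question; column types are assumed
df1 <- data.table(C1 = "R1", C2 = 1:11, C3 = "val",
                  C4 = c(182, 22, 45, 0, 0, 0, 0, 108, 99, 0, 0))

# rleid() gives every run of consecutive identical values its own id
run_id <- rleid(df1$C4 == 0)
print(run_id)
# run ids: 1 1 1 2 2 2 2 3 3 4 4

# Keep only the all-zero runs and return one row per run
# (start, end and run are illustrative names, not from the original answer)
res <- df1[, if (all(C4 == 0)) .(start = min(C2), end = max(C2)),
           by = .(run = rleid(C4 == 0))]
print(res)
```

Runs 2 and 4 are the ones where 'C4' is 0, so res holds the pairs 4, 7 and 10, 11 as separate columns, which may be easier to work with than a flat vector.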

Or using tidyverse

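library(tidyverse)
df1 %>%
   # start a new group each time C4 changes from non-zero to zero
   group_by(grp = cumsum(c(TRUE, diff(C4 != 0) < 0))) %>% 
   filter(C4 == 0) %>% 
   summarise(Range = list(range(C2))) %>%
   # name the list column explicitly; newer tidyr versions warn if it is omitted
   unnest(Range)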

NOTE: If the data contain more than one value of 'C1' and the ranges should be found separately for each, include 'C1' in the group_by as well.
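If no extra packages are wanted, the same idea can be sketched in base R with rle(). This is a swapped-in alternative, not one of the approaches above; df1 is again rebuilt from the question's table (column types assumed), and runs, starts, ends and zero_runs are illustrative names.

```r
# Sample data reconstructed from the question; column types are assumed
df1 <- data.frame(C1 = "R1", C2 = 1:11, C3 = "val",
                  C4 = c(182, 22, 45, 0, 0, 0, 0, 108, 99, 0, 0))

# Run-length encode the "is zero" indicator
runs <- rle(df1$C4 == 0)

# Row index of the last and first element of every run
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

# Keep only the runs whose value is TRUE (C4 == 0) and look up C2 there
zero_runs <- data.frame(start = df1$C2[starts[runs$values]],
                        end   = df1$C2[ends[runs$values]])

ranges <- paste(zero_runs$start, zero_runs$end, sep = "-")
print(ranges)
# [1] "4-7"   "10-11"
```

This assumes 'C4' has no missing values and that the rows are already in the order in which the runs should be detected.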

akrun