I have a data.frame in R where the value column is of class character. I want to identify the row numbers where value changes from the previous row. In the example below I want to get 4, 7, and 9. Is there a way to do this without looping?

df <- data.frame(ind=1:10,
                 value=as.character(c(100,100,100,200,200,200,300,300,400,400)),
                 stringsAsFactors=FALSE)
df
   ind value
1    1   100
2    2   100
3    3   100
4    4   200
5    5   200
6    6   200
7    7   300
8    8   300
9    9   400
10  10   400
Gaurav Bansal

3 Answers

A simple solution is to use the lag function in dplyr:

which(df$value != dplyr::lag(df$value))
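To see why this works: lag shifts the vector down by one position and pads the front with NA, so the first comparison is NA and which() drops it. A minimal sketch in base R, assuming the default lag of one position (prev and changed are names introduced here for illustration):

```r
value <- c("100", "100", "100", "200", "200", "200", "300", "300", "400", "400")

# Base-R stand-in for dplyr::lag(value): shift down by one, pad the front with NA
prev <- c(NA, head(value, -1))

changed <- value != prev   # first element is NA, not TRUE or FALSE
which(changed)             # which() ignores NA and keeps only the TRUE positions
# [1] 4 7 9
```

One thing to keep in mind: if value itself contains NA, the comparisons next to it are also NA and are silently dropped as well.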
thc

Similar to @thc's answer, but without a dependency:

which(c(FALSE, tail(df$value,-1) != head(df$value,-1)))
#[1] 4 7 9
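If this is needed in several places, the comparison can be wrapped in a small function. The sketch below is one way to do it; change_rows is a hypothetical name, not a base R function.

```r
# Hypothetical helper: positions where x differs from the element before it
change_rows <- function(x) {
  # tail(x, -1) is x without its first element, head(x, -1) is x without its last,
  # so the comparison lines each element up with its predecessor.
  # The leading FALSE stands in for the first element, which has no predecessor.
  which(c(FALSE, tail(x, -1) != head(x, -1)))
}

df <- data.frame(ind = 1:10,
                 value = as.character(c(100, 100, 100, 200, 200, 200, 300, 300, 400, 400)),
                 stringsAsFactors = FALSE)

change_rows(df$value)
# [1] 4 7 9
change_rows("100")   # a single element has nothing before it to differ from
# integer(0)
```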
thelatemail

You can use rle (Run Length Encoding):

cumsum(rle(df$value)$lengths)+1
[1]  4  7  9 11

The last value (11) points one row past the end of the data, so use head to drop it:

head(cumsum(rle(df$value)$lengths)+1, -1)
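Looking at the intermediate pieces can make the rle approach easier to follow. The sketch below also pairs each change row with the value that starts there; runs, starts, and changes are names chosen here for illustration.

```r
value <- c("100", "100", "100", "200", "200", "200", "300", "300", "400", "400")

runs <- rle(value)
runs$lengths   # 3 3 2 2: the length of each run of equal values
runs$values    # "100" "200" "300" "400": the value held in each run

# Each run ends at cumsum(lengths), so the next run starts one row later.
# The final entry would point past the last row, hence head(..., -1).
starts <- head(cumsum(runs$lengths) + 1, -1)

# Pair each change row with the value that begins there
changes <- data.frame(row = starts, new_value = runs$values[-1])
changes
```

Here changes has three rows: row 4 with "200", row 7 with "300", and row 9 with "400".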
HubertL