
I received some data from a colleague who is working with animal observations recorded along several transects. However, my colleague reused the same four ID codes to identify the transects: 1, 7, 13 and 19. I would like to replace the repeated IDs with unique ones, so that each consecutive block of identical ID_Transect values gets its own sequential number (1, 2, 3, ...), as in the transect_id column of the example data below.

Here's the corresponding code:

example_data <- structure(list(
  ID_Transect = c(1L, 1L, 1L, 1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 7L,
                  13L, 13L, 13L, 13L, 13L, 13L, 19L, 19L, 19L, 19L, 19L, 19L,
                  1L, 1L, 1L, 1L, 1L, 1L, 7L, 7L, 7L, 7L, 7L, 7L),
  transect_id = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
                  3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L,
                  5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L)
), class = "data.frame", row.names = c(NA, -36L))
Filipe Dias

2 Answers


We can also do this with rleid from data.table, which gives each consecutive run of identical values its own ID:

library(data.table)
setDT(example_data)[, transect_id := rleid(ID_Transect)]
akrun

You can use rleid from data.table:

example_data$transect_id <- data.table::rleid(example_data$ID_Transect)
#[1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6

In base R, we can use rle:

with(rle(example_data$ID_Transect), rep(seq_along(values), lengths))
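
To see why the rle version works, it may help to look at what rle returns for a short vector. This is only a minimal sketch: x is a made-up toy vector, not the data from the question, and r and run_id are names chosen for illustration.

```r
# Toy vector (hypothetical, not the question's data)
x <- c(1L, 1L, 7L, 7L, 7L, 1L)

r <- rle(x)
r$values   # 1 7 1 : the value of each consecutive run
r$lengths  # 2 3 1 : how many elements each run contains

# Number the runs 1, 2, 3, ... and repeat each number by its run length
run_id <- rep(seq_along(r$values), r$lengths)
run_id     # 1 1 2 2 2 3
```

The second run of 1s gets ID 3 rather than 1, because rle only groups values that are adjacent.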

Or use diff and cumsum:

cumsum(c(TRUE, diff(example_data$ID_Transect) != 0))
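
The diff + cumsum one-liner can be unpacked step by step. The sketch below uses a made-up toy vector x (not the question's data), and the intermediate names d, starts and run_id are chosen for illustration. It assumes the IDs are numeric with no missing values, since diff needs numbers to subtract.

```r
# Toy vector (hypothetical, not the question's data)
x <- c(1L, 1L, 7L, 7L, 7L, 1L)

# Difference between each element and the one before it
d <- diff(x)                 # 0 6 0 0 -6

# A run starts at the first element and wherever the value changes
starts <- c(TRUE, d != 0)    # TRUE FALSE TRUE FALSE FALSE TRUE

# Counting the run starts so far gives the run number of each element
run_id <- cumsum(starts)     # 1 1 2 2 2 3
```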
Ronak Shah