I have a vector:

as <- c(1,2,3,4,5,9)

I need to extract the first continuous sequence in the vector, starting at index 1, such that the output is the following:

1 2 3 4 5

Is there a smart function for doing this, or do I have to do something not so elegant like this:

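a <- c(1,2,3,4,5,9)
is_continunous <- c()
# Stop at length(a) - 1 so that a[i+1] never reads past the end of a
for (i in seq_len(length(a) - 1)) {
  if(a[i+1] - a[i] == 1) {
    is_continunous <- c(is_continunous, i)
  } else {
    break
  }
}

continunous_numbers <- c()
# Guard against the case where no sequence starts at index 1
if(length(is_continunous) > 0 && is_continunous[1] == 1) {
  is_continunous <- c(is_continunous, length(is_continunous)+1)
  continunous_numbers <- a[is_continunous]
}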

It does the trick, but I would expect that there is a function that can already do this.

Sotos
Esben Eickhardt

2 Answers

It isn't clear whether you need the indices of the continuous sequence only if it starts at index one, or those of the first continuous sequence, whatever its starting index is.

In both cases, you need to start by checking the difference between adjacent elements:

d_as <- diff(as)

If you need the first sequence only if it starts at index 1:

if(d_as[1]==1) 1:(rle(d_as)$lengths[1]+1) else NULL
# [1] 1 2 3 4 5

rle returns the lengths and values of each run of consecutive equal values.
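
As a rough illustration, this is what rle gives for the differences of the vector from the question (the name `r` is used only for this sketch):

```r
as <- c(1, 2, 3, 4, 5, 9)
d_as <- diff(as)   # 1 1 1 1 4
r <- rle(d_as)
r$lengths          # 4 1: four 1s in a row, then a single 4
r$values           # 1 4
```

The first length (4) counts steps between neighbours, so the run covers 4 + 1 = 5 elements, which is why the expression above adds 1.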

If you need the first continuous sequence, whatever the starting index is:

rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))

Examples (for the second option):

as <- c(1,2,3,4,5,9) 
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
#[1] 1 2 3 4 5

as <- c(4,3,1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 3 4 5 6 7

as <- c(1, 2, 3, 6, 7, 8)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 1 2 3
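
Both options could be wrapped in one small function. The sketch below is only one possible way to do it: the name `first_run_idx` and the `from_start` argument are made up for illustration, and it assumes a numeric vector without NA values. It returns an empty integer vector when no suitable run exists.

```r
# Hypothetical helper (not part of base R): indices of the first run of
# values that increase by exactly 1.
first_run_idx <- function(x, from_start = FALSE) {
  d <- diff(x)
  r <- rle(d)
  if (from_start) {
    # Only accept a run that begins at index 1
    if (length(d) > 0 && d[1] == 1) return(seq_len(r$lengths[1] + 1))
    return(integer(0))
  }
  start <- which(d == 1)[1]
  if (is.na(start)) return(integer(0))   # no step of exactly 1 anywhere
  start + (0:(r$lengths[r$values == 1][1]))
}

first_run_idx(c(4, 3, 1, 2, 3, 4, 5, 9))
# [1] 3 4 5 6 7
first_run_idx(c(4, 3, 1, 2, 3, 4, 5, 9), from_start = TRUE)
# integer(0)
```

To get the values rather than the indices, subset the original vector with the result, e.g. `x[first_run_idx(x)]`.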
Cath

A simple way to catch the sequence would be to find the diff of your vector and grab all elements with diff == 1 plus the very next element, i.e.

d1<- which(diff(as) == 1)
as[c(d1, d1[length(d1)]+1)]
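
A quick check of this one-liner, first on the vector from the question and then on a vector with two runs, where it also picks up the second run (the second vector and the names `as2`, `d2`, `res1`, `res2` are made up for illustration):

```r
as <- c(1, 2, 3, 4, 5, 9)
d1 <- which(diff(as) == 1)
res1 <- as[c(d1, d1[length(d1)] + 1)]
res1
# [1] 1 2 3 4 5

as2 <- c(1, 2, 3, 6, 7, 8)
d2 <- which(diff(as2) == 1)      # 1 2 4 5: the steps of both runs
res2 <- as2[c(d2, d2[length(d2)] + 1)]
res2
# [1] 1 2 6 7 8
```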

NOTE

This will only work if you have a single sequence in your vector. However, if we want to make it more general, then I'd suggest creating a function like so:

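get_seq <- function(vec){
  # Positions where the next element is exactly 1 larger
  # (use the argument vec here, not the global variable as)
  d1 <-  which(diff(vec) == 1)
  if(all(diff(d1) == 1)){
    # Only one sequence: its indices plus the element that closes it
    return(c(d1, d1[length(d1)]+1))
  }else{
    # Several sequences: keep only the first group of adjacent positions
    d2 <- split(d1, cumsum(c(1, diff(d1) != 1)))[[1]]
    return(c(d2, d2[length(d2)]+1))
  }
}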


#testing it

as <- c(3, 5, 1, 2, 3, 4, 9, 7, 5, 4, 5, 6, 7, 8)
get_seq(as)
#[1] 3 4 5 6

as <- c(8, 9, 10, 11, 1, 2, 3, 4, 7, 8, 9, 10)
get_seq(as)
#[1]  1 2 3 4

as <- c(1, 2, 3, 4, 5, 6, 11)
get_seq(as)
#[1] 1 2 3 4 5 6
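
The grouping step inside get_seq can be looked at in isolation. The sketch below uses the positions from the first test vector; the names `grp` and `runs` are only for illustration:

```r
d1 <- c(3, 4, 5, 10, 11, 12, 13)     # positions where diff == 1
grp <- cumsum(c(1, diff(d1) != 1))   # 1 1 1 2 2 2 2: counter rises at each gap
runs <- split(d1, grp)               # one list element per group
runs[[1]]
# [1] 3 4 5
```

Each gap between adjacent positions starts a new group, so the first list element holds the positions belonging to the first sequence.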
Sotos
  • Thanks for the answer. I didn't mean to steal your answer, but I just wanted to let people know that I had gotten a satisfactory answer and that my question was answered. – Esben Eickhardt Jul 13 '17 at 09:47
  • @EsbenEickhardt no worries. You did good. I know you meant well :) – Sotos Jul 13 '17 at 09:50
  • @EsbenEickhardt made a function that will work in any case – Sotos Jul 13 '17 at 10:13
  • Cool, thanks! I am sure this will come in handy in some other case. – Esben Eickhardt Jul 13 '17 at 10:31
  • @EsbenEickhardt just to clarify. You need to return the indices of the continuous values, not values themselves right? (Your example is not clear as you have values 1, 2, 3, 4, 5 at indices 1, 2, 3, 4, 5) :) – Sotos Jul 13 '17 at 10:32