25

I am trying to create sequences of 6 consecutive numbers, repeated at intervals of 144.

For example:

c(1:6, 144:149, 288:293)

1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293

How could I generate such a sequence automatically with `seq` or with another function?

Maël
giac

6 Answers

27

I find the sequence function to be helpful in this case. If you had your data in a structure like this:

(info <- data.frame(start=c(1, 144, 288), len=c(6, 6, 6)))
#   start len
# 1     1   6
# 2   144   6
# 3   288   6

then you could do this in one line with:

sequence(info$len) + rep(info$start-1, info$len)
#  [1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293

Note that this solution works even if the sequences you're combining are different lengths.
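
As a minimal sketch of that last point, here is the same one-liner applied to runs of unequal length. The lengths 2, 4 and 3 are made up for illustration and are not from the question:

```r
# Hypothetical lengths (2, 4, 3), chosen only to show unequal runs.
info <- data.frame(start = c(1, 144, 288), len = c(2, 4, 3))

# sequence() restarts the count at 1 for each run; rep() supplies the
# per-run offset (start - 1), repeated once per element of that run.
res <- sequence(info$len) + rep(info$start - 1, info$len)
res
# [1]   1   2 144 145 146 147 288 289 290
```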

josliber
  • Yeah, given the abnormal intervals in the OP's example, storing the structure explicitly like that is probably a good idea. – Frank Jun 29 '15 at 16:59
  • 1
    From the `?` page, "Note that sequence <- function(nvec) unlist(lapply(nvec, seq_len)) and it mainly exists in reverence to the very early history of R." Sort of like Frank's answer. – Carl Witthoft Jun 29 '15 at 18:40
  • @josilber - I still have to fill *manually* the `c(1, 144, 288)` ? what if I want 10 sequences or 100 sequences of 6 digits ? What would be your solution ? thanks – giac Jun 30 '15 at 06:58
  • 2
    @giacomoV if there is no pattern to the starting points and lengths, then yes of course you will need to specify them manually. If there is a pattern to the starting points and lengths then it will be easier. For instance if you wanted 100 sequences of length 6 with the starting points increasing by 144 each time starting at 0 you would use `info <- data.frame(start=seq(0, by=144, length.out=100), len=6)`. – josliber Jun 30 '15 at 14:39
  • 1
    Since R >= 4.0.0, `sequence` has now a built-in parameter `from` which helps a lot! See [here](https://stackoverflow.com/a/70581941/13460602). – Maël Jan 04 '22 at 16:35
7

Here's one approach:

unlist(lapply(c(0L,(1:2)*144L-1L),`+`,seq_len(6)))
# or...
unlist(lapply(c(1L,(1:2)*144L),function(x)seq(x,x+5)))

Here's a way I like a little better:

rep(c(0L,(1:2)*144L-1L),each=6) + seq_len(6)

Generalizing...

rlen  <- 6L
rgap  <- 144L
rnum  <- 3L

starters <- c(0L,seq_len(rnum-1L)*rgap-1L)

rep(starters, each=rlen) + seq_len(rlen)
# or...
unlist(lapply(starters+1L,function(x)seq(x,x+rlen-1L)))
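
The generalization above could be wrapped in a small function. This is only a sketch: the name `run_seq` and its default values are assumptions for illustration, and the body simply reuses the `starters` construction from this answer.

```r
# Hypothetical helper; the name and defaults are illustrative only.
run_seq <- function(rnum, rlen = 6L, rgap = 144L) {
  # Offsets 0, rgap - 1, 2 * rgap - 1, ... (one less than each start value).
  # seq_len(0) is empty, so rnum = 1 yields the single offset 0.
  starters <- c(0L, seq_len(rnum - 1L) * rgap - 1L)
  # Repeat each offset rlen times, then add 1..rlen (recycled across runs).
  rep(starters, each = rlen) + seq_len(rlen)
}

run_seq(3L)
#  [1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293
```

Using `seq_len(rnum - 1L)` rather than `1:(rnum - 1L)` keeps the function correct when only one run is requested.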
Frank
  • 1
    I feel like there should be a more elegant solution to this problem, though. – Frank Jun 29 '15 at 16:16
  • unlist(lapply(...)) can be replaced with sapply – hedgedandlevered Jun 29 '15 at 16:21
  • @hedgedandlevered Yeah, I tried that too, but it gives a matrix... and if I do sapply with simplify=FALSE, I get back to the lapply result. Could do `c(sapply(...))`, I suppose – Frank Jun 29 '15 at 16:22
  • In OP's desired output the first interval is not equal to the others (143 vs. 144). May be an oversight on their part. – Pierre L Jun 29 '15 at 16:40
  • @plafort Yeah, I think it should start at 1, 145, 289, probably. Anyway, the adaptation to that case is pretty clear (and what I accidentally wrote in my first version of the answer): `starters <- (1:rnum-1L)*rgap` – Frank Jun 29 '15 at 16:42
  • This was my solution, but I don't know how to adjust it for the mixed intervals `(1:293)[c(rep(T,6L), rep(F, 137L))]` – Pierre L Jun 29 '15 at 16:45
  • Yeah, because they are not uniform, there is no natural extension, I guess. – Frank Jun 29 '15 at 16:47
  • Instead of `seq_len(6)` you can just use (1:6) – Mike Wise Jun 29 '15 at 16:47
  • 2
    @MikeWise Yeah, `seq_len` is just a little faster, they say, though I doubt that matters in this application. Just habit. Also, for the "generalization", I'd have to write `\`:\`(1,rlen)` which is kind of awkward. – Frank Jun 29 '15 at 16:48
  • thanks again for your help everybody ! Can I accept the answer or you think new developments are about to come ? – giac Jun 29 '15 at 16:54
  • 2
    @giacomoV There's no rush to accept. I'm not crazy about my answer, so I'd just as well see you leave the question open for a day or two and see if something better is found. – Frank Jun 29 '15 at 16:56
  • @Vlo, that's actually documented behavior under `?c`-- `c` strips a lot of attributes, and a `matrix` is pretty much a `vector` with dimensional attributes. – A5C1D2H2I1M1N2O1R2T1 Jul 02 '15 at 03:28
5

This can also be done using seq or seq.int:

x = c(1, 144, 288)
c(sapply(x, function(y) seq.int(y, length.out = 6)))

#[1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293

As @Frank mentioned in the comments, here is another way to achieve this using @josliber's data structure (this is particularly useful when different intervals need different sequence lengths):

c(with(info, mapply(seq.int, start, length.out=len)))

#[1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293
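
One caveat worth sketching: when the lengths really do differ, `mapply` can no longer simplify its result to a matrix and returns a list, which `c()` does not flatten. A possible variant, assuming hypothetical lengths 2, 4 and 3, is to ask for a list explicitly and `unlist` it:

```r
# Hypothetical lengths (2, 4, 3), chosen only to show unequal runs.
info <- data.frame(start = c(1L, 144L, 288L), len = c(2L, 4L, 3L))

# SIMPLIFY = FALSE always returns a list (one element per run),
# so unlist() gives a plain vector whether or not the lengths match.
res <- unlist(with(info, mapply(seq.int, start, length.out = len,
                                SIMPLIFY = FALSE)))
res
# [1]   1   2 144 145 146 147 288 289 290
```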
Veerendra Gadekar
  • 1
    Yeah, this is good. I find it weird that I can't get the result in integers even if I switch to `x = c(1L, 144L, 288L)`, though. I think this is a flaw in how `seq` treats `length.out`... `seq.int` seems to do the "right" thing, fortunately. – Frank Jun 29 '15 at 17:07
  • 2
    Yes, I know. The result of `seq(1L,length.out=6)` should be an integer vector is my point. I'm criticizing how the function works, not your answer (which seems the best so far to me). Your answer would work "better" (to my mind), though, if `x` were an integer and `seq.int` were used in place of `seq` so that the end result is an integer vector (like the OP's example). – Frank Jun 29 '15 at 17:14
  • You could expand the capability by defining a `sublength <- c(6,6,6)` and setting `length.out=sublength` , so that each sequence could be of a different length. – Carl Witthoft Jun 29 '15 at 18:38
  • 3
    @CarlWitthoft Unfortunately, `seq`/`seq.int` is not vectorised in the `length.out` argument. That just means `mapply` would be the way to go, though, instead of `sapply`. Using josilber's data structure, `c(with(info, mapply(seq.int, start, length.out=len)))` – Frank Jun 29 '15 at 20:14
  • I think @Frank has really hit the nail on the head here -- `seq` is not vectorized in any way at all, making it really annoying to use for these sorts of problems! – josliber Jun 30 '15 at 14:43
  • @josilber Oh, I'd figured `seq` was vectorized in some other arg, but I see that you're right. Anyway, I would still prefer the `mapply` here, to construct the sequences directly rather than arithmetically. I can read that line of code and guess what it's doing. – Frank Jun 30 '15 at 14:52
  • 1
    @Frank seems like a matter of preference then :) – josliber Jun 30 '15 at 14:54
3

From R >= 4.0.0, you can now do this in one line with sequence:

sequence(c(6,6,6), from = c(1,144,288))
[1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293

The first argument, nvec, is the length of each sequence; the second, from, is the starting point for each sequence.
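
Because both arguments are vectors, the runs do not have to share a length. The sketch below uses made-up lengths and starting points, and also the `by` argument of `sequence` (the step within each run), which this answer does not otherwise use:

```r
# Requires R >= 4.0.0, where sequence() gained `from` and `by`.
# Runs of different lengths: 2 values from 1, then 3 values from 144.
x <- sequence(c(2, 3), from = c(1, 144))
x
# [1]   1   2 144 145 146

# `by` sets the step inside each run (values here are illustrative).
y <- sequence(c(3, 2), from = c(2, 10), by = c(2, 5))
y
# [1]  2  4  6 10 15
```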

As a function, with n being the number of intervals you want:

f <- function(n) sequence(rep(6,n), from = c(1,144*seq_len(n-1)))  # seq_len() also handles n = 1, where 1:(n-1) would give c(1, 0)
f(3)
[1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293
Maël
0

I am using R 3.3.2 on OS X 10.9.4.

I tried:

a<-c()  # stores expected sequence
f<-288  # starting number of final sub-sequence
it<-144 # interval
for (d in seq(0,f,by=it))
{
    if (d==0)
    {
        d=1
    }
    a<-c(a, seq(d,d+5))
    print(d)
}
print(a)

The expected sequence is stored in a:

[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293

And another try:

a<-c()  # stores expected sequence
it<-144 # interval
lo<-4   # number of sub-sequences
for (d in seq(0,by=it, length.out = lo))
{
    if (d==0)
    {
        d=1
    }
    a<-c(a, seq(d,d+5))
    print(d)
}
print(a)

The result:

[1] 1 2 3 4 5 6 144 145 146 147 148 149 288 289 290 291 292 293 432 433 434 435 436 437
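
For comparison, the second loop could be written without growing `a` inside the loop. This is only a sketch of one alternative; the variable names `len` and `starts` are assumptions added for illustration:

```r
it  <- 144L  # interval
lo  <- 4L    # number of sub-sequences
len <- 6L    # length of each sub-sequence (hypothetical name)

# Start values 0, 144, 288, 432; pmax() replaces the leading 0 with 1,
# which mirrors the `if (d == 0) d = 1` branch in the loop.
starts <- pmax(seq(0L, by = it, length.out = lo), 1L)

# Build each sub-sequence, then flatten the list into one vector.
a <- unlist(lapply(starts, function(d) seq(d, length.out = len)))
print(a)
```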

Nick Dong
0

I tackled this with the cumsum function:

seq_n <- 3 # number of sequences
rep(1:6, seq_n) + rep(c(0, cumsum(rep(144, seq_n-1))-1), each = 6)
# [1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293

There is no need to calculate the starting values of the sequences as in @josliber's solution, but the length of each sequence has to be constant.
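
The same idea can be sketched with an explicit vector of gaps between consecutive starting values, which makes the uneven first interval in the question (143, then 144) visible. The names `gaps` and `offsets` are assumptions for illustration:

```r
# Gaps between consecutive start values: 1 -> 144 is 143, 144 -> 288 is 144.
gaps <- c(143, 144)

# Cumulative offsets to add to the base run 1:6.
offsets <- c(0, cumsum(gaps))

# Repeat the base run once per offset and shift each copy.
res <- rep(1:6, length(offsets)) + rep(offsets, each = 6)
res
#  [1]   1   2   3   4   5   6 144 145 146 147 148 149 288 289 290 291 292 293
```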

Dominik Lenda