69

I need to remove the last number in a group of vectors, i.e.:

v <- 1:3
v1 <- 4:8

should become:

v <- 1:2
v1 <- 4:7
Gregor Thomas
Elizabeth
  • 1
    Possible duplicate of [R: removing the last elements of a vector](http://stackoverflow.com/questions/3753687/r-removing-the-last-elements-of-a-vector) – C8H10N4O2 Dec 02 '15 at 21:59

4 Answers

131

You can use negative offsets in head (or tail), so head(x, -1) removes the last element:

R> head( 1:4, -1)
[1] 1 2 3
R> 

This also saves an additional call to length().

Edit: As pointed out by Jason, this approach is actually not faster. Can't argue with empirics. On my machine:

R> x <- rnorm(1000)
R> microbenchmark( y <- head(x, -1), y <- x[-length(x)], times=10000)
Unit: microseconds
                expr    min      lq median     uq     max
1   y <- head(x, -1) 29.412 31.0385 31.713 32.578 872.168
2 y <- x[-length(x)] 14.703 15.1150 15.565 15.955 706.880
R> 
Dirk Eddelbuettel
  • 5
    According to the source, `head` calls `length` twice, once for error checking and once for a call to `max` or `min`. Does `[` call it a total of three times? For some reason, I am having trouble finding the code that implements `[`. – Jason Morgan Aug 28 '12 at 02:48
  • 1
    +1 For the edit :) and, more importantly, reminding me that I can use `head` and `tail` this way. – Jason Morgan Aug 28 '12 at 03:04
  • 1
    @Dirk: I ran your code on my machine and using `-length()` is indeed faster. But if I increase the size of `x` to 10000, then `head` is actually about 2~3 times faster. There seems to be some dependency on the vector length but I don't have any explanation. –  Aug 28 '13 at 14:29
  • 2
    by symmetry, `tail(my_vector,-1)` can be used to remove the first element – Antoine Jan 15 '18 at 12:35
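
To make the symmetry between `head` and `tail` concrete, here is a minimal sketch using only base R. The variable names are illustrative and not from the original answer.

```r
x <- 1:4

# A negative n in head() drops the last |n| elements.
without_last <- head(x, -1)      # 1 2 3
without_last_two <- head(x, -2)  # 1 2

# A negative n in tail() drops the first |n| elements.
without_first <- tail(x, -1)     # 2 3 4

print(without_last)
print(without_first)
```
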
56

Use length to get the length of the object and - to remove the last one.

v[-length(v)]

A negative index in R extracts everything but the given indices.
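
Since the question asks about a group of vectors, one way to apply this might be to collect them in a list and use `lapply`. This is only a sketch: `drop_last` and `vectors` are hypothetical names introduced for illustration.

```r
v  <- 1:3
v1 <- 4:8

# Hypothetical helper: length(x) is the position of the last element,
# and a negative index drops that position.
drop_last <- function(x) x[-length(x)]

# Apply the helper to every vector in a named list.
vectors <- list(v = v, v1 = v1)
trimmed <- lapply(vectors, drop_last)

print(trimmed$v)   # 1 2
print(trimmed$v1)  # 4 5 6 7
```

For an empty vector, `length(x)` is 0 and `x[-0]` is the same as `x[0]`, so the helper returns an empty vector rather than raising an error.
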

Paul Hiemstra
Luciano Selzer
11

Dirk and Iselzer have already provided the answers. Dirk's is certainly the most straightforward, but on my system at least it is marginally slower, probably because vector subsetting with [ and length checking is cheap (and according to the source, head does use length, twice actually):

> x <- rnorm(1000)
> system.time(replicate(50000, y <- head(x, -1)))
   user  system elapsed 
   3.69    0.56    4.25 
> system.time(replicate(50000, y <- x[-length(x)]))
   user  system elapsed 
  3.504   0.552   4.058

This pattern held up for larger vector lengths and more replications. YMMV. The legibility of head certainly outweighs the marginal performance improvement of [ in most cases.

Jason Morgan
  • 4
    Yes, although legibility is a bit in the eye of the beholder. I always have to spend an extra few milliseconds thinking about what `head` and `tail` really do with negative arguments ... whereas `x[-length(x)]`, while clunky, is idiomatic to my R-soaked brain. – Ben Bolker Aug 28 '12 at 03:08
4

This is another option, which has not been suggested before. NROW treats your vector as a 1-column matrix.

v[-max(NROW(v))]#1 2
v1[-max(NROW(v1))]#4 5 6 7

Based on the discussion above, this is (slightly) faster than all the other methods suggested:

x <- rnorm(1000)
system.time(replicate(50000, y <- head(x, -1)))
user  system elapsed 
3.446   0.292   3.762
system.time(replicate(50000, y <- x[-length(x)]))
user  system elapsed 
2.131   0.326   2.472
system.time(replicate(50000, y <- x[-max(NROW(x))]))
user  system elapsed 
2.076   0.262   2.342
Elin
milan
  • 2
    But why `max()`? – s_baldur Dec 04 '18 at 08:24
  • `system.time` is not completely reliable (neither is `microbenchmark`, but it at least gives you a better summary). The implementation of `NROW` returns `length(x)` if the object is a vector. The `max()` also seems to be completely arbitrary. So calling `NROW` on a vector is identical to `length(x)` plus a few more function calls. Likely, GC was triggered (or triggered more often) during the second call, which resulted in a longer `system.time`. But this should not be interpreted as a general rule. – Colombo Aug 15 '22 at 01:56
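
The equivalence described in the comment above can be checked directly in base R. This is a small sketch, assuming `x` is a plain vector with no `dim` attribute.

```r
x <- rnorm(1000)

# For a plain vector, NROW() falls back to length().
same_count <- identical(NROW(x), length(x))

# max() of a single number returns that number, so it changes nothing here.
max_is_noop <- identical(max(NROW(x)), NROW(x))

# Both expressions therefore drop the same (last) element.
same_result <- identical(x[-max(NROW(x))], x[-length(x)])

print(c(same_count, max_is_noop, same_result))
```

The two functions do differ for objects with dimensions: for a matrix, `NROW` gives the number of rows while `length` gives the total number of elements.
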