
Say I have a vector, for example `x <- 1:10`. Then `x[0]` returns a zero-length vector of the same class as `x`, here `integer(0)`.

Is there a reason behind that choice, as opposed to throwing an error or returning `NA` as `x[11]` would? Also, if you can think of a situation where having `x[0]` return `integer(0)` is useful, please include it in your answer.
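
For reference, the behaviour being asked about can be reproduced with a minimal sketch like the one below. The `letters` comparison is an added illustration, not part of the original question.

```r
x <- 1:10

x[0]        # integer(0): zero-length, same type as x
x[11]       # NA: one element, but missing
letters[0]  # character(0): the type follows the vector being indexed

length(x[0])   # 0
length(x[11])  # 1
```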

buhtz
flodel
  • For the same reason `x[FALSE]` does – Tyler Rinker May 16 '12 at 01:10
  • @TylerRinker: Prima facie, it's an interesting question, as `x[0]`, which has not been explicitly defined, returns `integer(0)`, while `x[11]`, which has also not been explicitly defined, returns `NA`. Also, explicitly assigning `x[0] <- 5` returns no error or warning, but `x[0]` is still `integer(0)`. – jthetzel May 16 '12 at 01:28
  • @TylerRinker, I think I understand why `x[FALSE]` returns a zero-length vector: when extracting with logicals (TRUE/FALSE), one needs to provide a vector of the same length as `x`. So in your example, `FALSE` is recycled resulting in `x[rep(FALSE, length(x))]` and I am not surprised it returns `integer(0)`. The same way, `x[TRUE]` will return `x`. But I don't see the link you make between `x[0]` and `x[FALSE]`. Can you please elaborate? – flodel May 16 '12 at 01:32
  • This is a result of one-based indexing as opposed to [zero-based indexing](http://en.wikipedia.org/wiki/Zero-based_numbering), but is understandably unexpected. – jthetzel May 16 '12 at 01:35
  • @Flodel it was more of an off handed remark (a poor attempt at humor). I'm just as curious as to the answer (if there is one) as you are. – Tyler Rinker May 16 '12 at 01:48
  • Since you can't mix +ve and -ve indices, ignoring 0 might be the right answer: `x[0:11]` returns `[1] 1 2 3 4 5 6 7 8 9 10 NA` and `x[-(0:5)]` returns `[1] 6 7 8 9 10`. But I can't see how this would really be useful. – Matthew Lundberg May 16 '12 at 02:15
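
The cases raised in these comments can be checked side by side. This is only a sketch of the observed behaviour, not an explanation of the internals: a lone logical index is recycled to the length of `x`, while a numeric `0` is dropped from the index vector.

```r
x <- 1:10

# Logical index: FALSE is recycled to length(x), so nothing is selected
x[FALSE]                     # integer(0)
identical(x[TRUE], x)        # TRUE

# Numeric index: zeros are dropped, whether the rest is positive or negative
x[0:11]                      # 1 2 ... 10 NA  (the 0 contributes nothing)
x[-(0:5)]                    # 6 7 8 9 10

# Both routes end at the same empty vector
identical(x[FALSE], x[0])    # TRUE
```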

3 Answers


As seen in `?"["`:

NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.

So an index of 0 just gets ignored. We can see this in the following:

x <- 1:10
x[c(1, 3, 0, 5, 0)]
#[1] 1 3 5

So if the only index we give it is 0 then the appropriate response is to return an empty vector.
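
The same rule seems to apply on the assignment side, which is what jthetzel observed in the comments on the question: with a zero index there is nothing to assign to. A minimal sketch (the value `5L` is chosen here only so the example stays within integer type):

```r
x <- 1:10

# Zeros are dropped before selection, so only positions 1, 3 and 5 remain
length(x[c(1, 3, 0, 5, 0)])  # 3

# Assigning through a zero index selects no elements
x[0] <- 5L
x     # still 1 2 3 4 5 6 7 8 9 10
x[0]  # still integer(0)
```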

Dason
  • Thank you @Dason. I think it is worth noting that the sentence you quote is part of the "Matrices and Arrays" section, and not "Atomic Vectors", so it is not explicitly documenting or answering my question, but maybe it can be used for extrapolation. – flodel May 16 '12 at 03:22
  • Either way, although it can help the discussion, I am not so much interested in a "because the documentation says so" answer; I am more interested in why this particular choice was made. Some of the other answers and comments offer some plausible explanation as to why it should not return `NA` like `x[11]` does. So I am left wondering why the developers did not choose to throw an error instead. – flodel May 16 '12 at 03:22
  • An error would be bad because in cases like the split you want it to return something not an error. :) still pushing this idea I made up that sounds plausible. – Tyler Rinker May 16 '12 at 13:47
  • @TylerRinker Do you know that it's using 0 indexing in a split though? I doubt that's how the internals of split work. – Dason May 16 '12 at 15:28
  • @Dason Come on, you and I both know I don't know that. – Tyler Rinker May 16 '12 at 15:37
  • @TylerRinker and @Dason, my intuition about `split` is that `integer(0)` would be the result of evaluating `x[integer(0)]` rather than `x[0]`. Let's take your `with(mtcars, split(gear, list(cyl, am, carb)))` example which returned `numeric(0)` for {cyl=6, am=1, carb=8}.... – flodel May 18 '12 at 01:36
  • ...What I am saying is that `split` would first look for matches in the data where {cyl=6, am=1, carb=8}, which would return `integer(0)` (and not `0`), then do `mtcars$gear[integer(0)]`. Also notice that `which(5 == 1:4)` returns `integer(0)` and not `0`. – flodel May 18 '12 at 01:38
  • This being said, I have no problem with `x[integer(0)]` returning a zero-length vector, I even think it makes perfect sense. As far as `x[0]` is concerned, I am still puzzled why the developers made that choice; I think throwing an error would have been a better choice. – flodel May 18 '12 at 01:47
  • @flodel - I agree with you, and that's why I was asking Tyler if he knew whether it was using 0 indexing, because I figured what was going on behind the scenes was probably closer to what you described. I wish I had a better answer for you than "that's what the documentation specifies", because it is an interesting question and it would be nice to hear the core developers' justification. It might just be because that's what they originally did without thinking about it too much, and now it's that way for compatibility reasons. But there might be a better reason... – Dason May 18 '12 at 06:35
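
flodel's distinction between `0` and `integer(0)` as an index can be made concrete. In the sketch below, `which()` finds no match and returns an empty index vector rather than `0`; indexing with either gives the same empty result, but they reach it differently. The name `idx` is only for illustration.

```r
x <- 1:10

idx <- which(5 == 1:4)  # no position matches
idx                     # integer(0), not 0
length(idx)             # 0

x[idx]                  # integer(0): no indices supplied, nothing returned
x[0]                    # integer(0): one index supplied, then ignored

identical(x[idx], x[0]) # TRUE
```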

Here is my crack at it, bearing in mind that I am not a programmer and certainly do not contribute to the R source. I think it may be because you need some sort of placeholder to state that something was asked for here but nothing was returned. This becomes more apparent with things like tables and `split`. For instance, when you make a table of values and a cell has a count of zero, you still need to record that the cell exists but holds no values. It would not be appropriate to have `x[0] == 0`, because the result is not the numeric value zero but the absence of any value.

So in the following splits we need a placeholder, and `integer(0)` holds the place of "no values returned", which is not the same as `0`. Notice that the second call returns `numeric(0)`, which is still a placeholder, this time for a numeric vector.

with(mtcars, split(as.integer(gear), list(cyl, am, carb)))
with(mtcars, split(gear, list(cyl, am, carb)))

So in a way my `x[FALSE]` retort is true, in that it holds the place of the nonexistent zero spot in the vector.
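
To see the empty groups concretely, here is a sketch using the same `split` call. It relies on the built-in `mtcars` data; the name `groups` and the choice of the `6.1.8` combination (mentioned elsewhere in this thread) are only for illustration.

```r
groups <- with(mtcars, split(gear, list(cyl, am, carb)))

# One list element per cyl/am/carb combination, including unobserved ones
length(groups)            # 36 (3 cyl x 2 am x 6 carb levels)

# No car in mtcars has cyl = 6, am = 1, carb = 8
groups[["6.1.8"]]         # numeric(0)

# Empty groups are kept as zero-length placeholders
sum(lengths(groups) == 0)
```

As flodel points out in the comments on another answer, this shows that empty results are useful as placeholders, but not necessarily that `split` indexes with `0` internally.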

All right, this baloney I just spewed is true until someone disputes it and tears it down.

PS: page 19 of this guide (LINK) states that `integer()` and `integer(0)` are empty integer vectors.

Related SO post: How to catch integer(0)?

Community
Tyler Rinker
  • I still don't feel this answers the question of why it doesn't return `NA` like `x[11]` does though. – Dason May 16 '12 at 02:15
  • Because `x[11]` does exist; it's just missing. Try `x[15] <- 3; x` and you'll see that in a vector every integer value technically exists except 0. Even `x[Inf]` exists; it's just missing (`NA`), whereas `x[0]` can never exist. Think of the `x[n]` as positioning and it makes sense. There is no such thing as an `x[0]` position. – Tyler Rinker May 16 '12 at 02:21
  • @TylerRinker: I agree with you, but just to further the discussion, if x[0] does not exist, why does `x[0] <- 1` not return a warning or error? Wouldn't we expect it to return something like `invalid first argument`, which is the error message returned by `assign(x[0], 1)`? – jthetzel May 16 '12 at 02:28
  • @jthetzel That's not exactly a good example since `assign(x[1], 1)` gives invalid first argument as well. However both `assign("x[1]", 1)` and `assign("x[0]", 1)` succeed. – Dason May 16 '12 at 02:30
  • @Dason: Thanks for pointing that out. Still, out of curiosity, I wonder why there is no error for `x[0] <- 1`. – jthetzel May 16 '12 at 02:37
  • @Dason that works but doesn't actually change the value in `x`, as you've just assigned the value to a variable named by the character string `"x[1]"`. Check this out: `assign('x[1]', 5); x`. Wouldn't you expect the first position of `x` to be 5? It ain't. – Tyler Rinker May 16 '12 at 02:37
  • @jthetzel not sure, but `integer(0) <- 4` throws an error, even though, oddly, `identical(integer(0), x[0])` says the two are equal. I'd actually expect the error too. – Tyler Rinker May 16 '12 at 02:40
  • @TylerRinker I realized that about the assign statements a little after posting but since it was more than five minutes later I couldn't modify my comment. I don't see why you would expect an error with that `identical` statement though. – Dason May 16 '12 at 02:49
  • No I wasn't saying an error, I'm saying that the two are identical and you can assign to one and the other throws up an error. PS we're way above my pay grade now. – Tyler Rinker May 16 '12 at 02:58
  • Oh I see what you're saying. But identical is just testing if they're equal in value. We can get a similar result that hopefully makes more sense by doing `x <- 1; identical(x[1], 1); 1 <- 1`. Clearly we can assign to x but we can't assign to 1. But x[1] and 1 are the same value. The value a reference contains isn't the same as the reference itself. – Dason May 16 '12 at 03:03
  • @TylerRinker: Consider ``assign('x[1]', 5); `x[1]` `` – jthetzel May 16 '12 at 03:11
  • @jthetzel that's no different than `assign('d', 5); 'd'`. The indexing has nothing to do with it, as `'x[1]'` is just a character string. It's not any different than `assign('%', 5); '%'; %`. Certain symbols (including `[`) can't be overwritten in R but a character string containing them can be. – Tyler Rinker May 16 '12 at 03:33
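
Two points from this comment thread can be checked directly: assigning past the end of a vector pads it with `NA`, and `assign()` with the string `"x[1]"` creates a separate variable with that literal name rather than modifying `x`. A minimal sketch:

```r
x <- 1:10

# Assigning beyond the end extends the vector; the gap is filled with NA
x[15] <- 3
length(x)   # 15
x[11:14]    # NA NA NA NA

# assign() treats its first argument as a variable name, not an expression
assign("x[1]", 5)
x[1]        # 1: x itself is unchanged
`x[1]`      # 5: a separate variable whose name happens to be "x[1]"
```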

Since the array indices are 1-based, index 0 has no meaning. The value is ignored as a vector index.

Matthew Lundberg