Recently I've stumbled upon this bit of code:

y <- NULL

y[cbind(1:2, 1:2)] <- list( list(1,2), list(2,3))

From the second answer here.

But it doesn't seem to differ from y <- list(...), as the comparisons below show:

> identical(y, list( list(1,2), list(2,3)))
[1] TRUE
> identical(y, y[cbind(1:2, 1:2)])
[1] FALSE

What is going on in the bracket assignment here? Why doesn't it throw an error? And why is it different from the non-assignment version in the last line of code?

Community
Ferdinand.kraft
  • Isn't it similar to `y[1:2] <- list( list(1,2), list(2,3) )`. I don't see the function of `cbind` here. – Arun Aug 04 '13 at 14:46
  • @Arun, the function of `cbind` is masked by the repetition of `1:2`. If instead we use `cbind(1:2, 2:3)` then it will have an effect (although no different than from `c`) – Ricardo Saporta Aug 04 '13 at 15:00
  • Ricardo, give me a moment :) – Arun Aug 04 '13 at 15:04

1 Answer

Matrix indexing only applies when y has a dim attribute and the index matrix has one column per dimension. Combine this with standard R recycling and the fact that all matrices are actually vectors, and this behavior makes sense.

When you initialize y to NULL, you ensure it has no dim. Therefore, when you index y by a matrix, say ind, you get the same result as calling y[as.vector(ind)]:

identical(y[ind], y[as.vector(ind)])
# [1] TRUE
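
To make the contrast concrete, here is a small sketch comparing an object without dim to one with dim. The names `v` and `m` are made up for this illustration and are not part of the question:

```r
ind <- cbind(1:2, 1:2)

v <- as.list(letters[1:4])   # plain list, no dim attribute
m <- matrix(1:4, nrow = 2)   # has dim, and ncol(ind) == length(dim(m))

# Without dim, the index matrix is used as a plain vector c(1, 2, 1, 2)
no_dim_same <- identical(v[ind], v[as.vector(ind)])

# With dim, each ROW of ind is one (row, col) pair: m[1, 1] and m[2, 2]
with_dim_matrix <- m[ind]

# Flattening the index first falls back to ordinary vector indexing
with_dim_vector <- m[as.vector(ind)]
```

Only in the `m[ind]` case is the index matrix interpreted as coordinates; everywhere else it is just a vector of four positions.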

If ind contains repeated values and you are also assigning, then for each index only the last value assigned to it remains. For example, let's assume we are executing

y <- NULL; y[cbind(1:2, 2:1)] <- list( list(1,2), list(3,4) )
#   y has no dimension, so `y[cbind(1:2, 2:1)]` 
#   is the equivalent of   `y[c(1:2, 2:1)]`

When you assign y[c(1, 2, 2, 1)] <- list("A", "B"), the two values are first recycled to "A", "B", "A", "B", so in effect what happens is analogous to:

    y[[1]] <- "A"
    y[[2]] <- "B"
    y[[2]] <- "A"  # <~~ 'Overwriting' previous value 
    y[[1]] <- "B"  # <~~ 'Overwriting' previous value 
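
The "recycle, then assign one element at a time" reading can be checked with a rough emulation. `assign_one_by_one` is a hypothetical helper written for this sketch, not something R provides, and it only covers positive integer indices on a list:

```r
# Hypothetical helper: emulate y[idx] <- value element by element
assign_one_by_one <- function(y, idx, value) {
  if (is.null(y)) y <- list()              # start from an empty list
  value <- rep_len(value, length(idx))     # recycle values to match idx
  for (k in seq_along(idx)) {
    y[[idx[k]]] <- value[[k]]              # later k overwrite earlier ones
  }
  y
}

# The case from the question: indices 1, 2, 1, 2 with two recycled values
ind <- cbind(1:2, 1:2)
val <- list(list(1, 2), list(2, 3))
y_builtin <- NULL
y_builtin[ind] <- val
y_manual <- assign_one_by_one(NULL, as.vector(ind), val)

# Four distinct values, so the "last one wins" effect is visible
z_builtin <- NULL
z_builtin[c(1, 2, 2, 1)] <- list("w", "x", "y", "z")
z_manual <- assign_one_by_one(NULL, c(1, 2, 2, 1), list("w", "x", "y", "z"))
```

In the question's case every position receives the same value twice, which is why the result is indistinguishable from a plain `y <- list(...)`.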

Here is a further look at the indexing that occurs (notice how the first two letters are repeated):

ind <- cbind(1:2, 1:2)
L <- as.list(LETTERS)
L[ind]
# [[1]]
# [1] "A"
# 
# [[2]]
# [1] "B"
# 
# [[3]]
# [1] "A"
# 
# [[4]]
# [1] "B"

Here is the same thing, now with assignment. Notice how only the 3rd and 4th values being assigned have been kept.

L[ind] <- c("FirstWord", "SecondWord", "ThirdWord", "FourthWord")
L[ind]
# [[1]]
# [1] "ThirdWord"
# 
# [[2]]
# [1] "FourthWord"
# 
# [[3]]
# [1] "ThirdWord"
# 
# [[4]]
# [1] "FourthWord"

Try a different index for further clarity:

ind <- cbind(c(3, 2), c(1, 3))  ## will be treated as c(3, 2, 1, 3) 
L <- as.list(LETTERS)
L[ind] <- c("FirstWord", "SecondWord", "ThirdWord", "FourthWord")
L[1:5]
#  [[1]]
#  [1] "ThirdWord"
#  
#  [[2]]
#  [1] "SecondWord"
#  
#  [[3]]
#  [1] "FourthWord"
#  
#  [[4]]
#  [1] "D"
#  
#  [[5]]
#  [1] "E"

L[ind]
#  [[1]]
#  [1] "FourthWord"
#  
#  [[2]]
#  [1] "SecondWord"
#  
#  [[3]]
#  [1] "ThirdWord"
#  
#  [[4]]
#  [1] "FourthWord"
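
One way to predict which assigned values survive is to look for indices that occur again later. This is only a sketch using base R's `duplicated()`; the variable names are chosen for this illustration:

```r
idx   <- as.vector(cbind(c(3, 2), c(1, 3)))   # c(3, 2, 1, 3)
words <- c("FirstWord", "SecondWord", "ThirdWord", "FourthWord")

# TRUE where the same index appears again further along,
# i.e. where the assigned value will be overwritten
overwritten <- duplicated(idx, fromLast = TRUE)

survivors <- words[!overwritten]

# Cross-check against the real assignment
L2 <- as.list(LETTERS)
L2[idx] <- words
kept <- unlist(L2[idx[!overwritten]])
```

Here only the first assignment (to position 3) is flagged, which matches "FirstWord" being absent from the output above.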

Edit regarding @agstudy's questions:

Looking at the source for [, we find the following comment:

    The special [ subscripting where dim(x) == ncol(subscript matrix)
    is handled inside VectorSubset. The subscript matrix is turned
    into a subscript vector of the appropriate size and then
    VectorSubset continues.

Looking at the function static SEXP VectorSubset(SEXP x, SEXP s, SEXP call), the relevant check is the following:

/* lines omitted */ 
attrib = getAttrib(x, R_DimSymbol);
/* lines omitted */
if (isMatrix(s) && isArray(x) && ncols(s) == length(attrib)) {
    /* lines omitted */
...
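
An R-level approximation of that condition may help in experimenting with it. `uses_matrix_indexing` is a hypothetical name for this sketch; it mirrors only the quoted check and assumes `is.matrix()` and `is.array()` behave like the C-level `isMatrix` and `isArray` for these inputs:

```r
# Hypothetical predicate mirroring only the quoted C condition
uses_matrix_indexing <- function(x, s) {
  is.matrix(s) && is.array(x) && ncol(s) == length(dim(x))
}

ind <- cbind(1:2, 1:2)
arr <- array(1:8, dim = c(2, 2, 2))

on_null   <- uses_matrix_indexing(NULL, ind)               # no dim at all
on_matrix <- uses_matrix_indexing(matrix(1:4, 2), ind)     # 2 columns, 2 dims
on_array  <- uses_matrix_indexing(arr, ind)                # 2 columns, 3 dims

# When the check fails, indexing falls back to the flattened vector
fallback_same <- identical(arr[ind], arr[as.vector(ind)])
```

The three-dimensional case shows that having dim is not enough on its own: the number of columns in the index matrix also has to match.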
Ricardo Saporta
  • Ricardo, `unlist` on a matrix has no effect. – Arun Aug 04 '13 at 14:57
  • @Arun, I meant `as.vector` fixed it. (thanks for pointing it out) – Ricardo Saporta Aug 04 '13 at 14:59
  • I think it's a bit confusing to talk about subsetting here. Your first code part just below *This perhaps can make it more clear*. That's what confused me at least. Probably, it's best to stick to just the assignment part? Just my thought. – Arun Aug 04 '13 at 15:17
  • Arun, how ironic that the _This can make it more clear_ is what is confusing, hahah. Let me see if I can give it an edit – Ricardo Saporta Aug 04 '13 at 15:18
  • :) sorry, I didn't mean to. The underlying concept is recycling put to good use, but the way it gets overwritten during assignment is probably less intuitive to begin with. And to me, showing the subsetting first swayed me off... – Arun Aug 04 '13 at 15:20
  • @Arun, I was laughing at the irony because it was completely true. I restructured my answer – Ricardo Saporta Aug 04 '13 at 15:24