The R language definition (for version 3.5.1) states:

The expression x[] returns x, but drops “irrelevant” attributes from the result. Only names and in multi-dimensional arrays dim and dimnames attributes are retained.

But consider the following example:

v <- factor(c(dog = 1, cat = 3))
attr(v, "label") <- "feeling confused"
attributes(v)
# $`names`
# [1] "dog" "cat"
# 
# $levels
# [1] "1" "3"
# 
# $class
# [1] "factor"
# 
# $label
# [1] "feeling confused"
attributes(v[])
# $`names`
# [1] "dog" "cat"
# 
# $levels
# [1] "1" "3"
# 
# $label
# [1] "feeling confused"
# 
# $class
# [1] "factor"

Attribute order is changed but all the attributes are retained.

all.equal(attributes(v)[c(1,2,4,3)], attributes(v[]))
# [1] TRUE

Why is my example exempt? Or what am I missing?

s_baldur
  • Can you provide a counterexample where the "irrelevant attributes" are dropped? – Mako212 Dec 19 '18 at 17:24
  • @Mako212 No, actually I can't. But haven't searched much. – s_baldur Dec 19 '18 at 17:25
  • I'm not sure what that example would be either, but I don't think we can consider the attributes of `v` to be irrelevant. Maybe this is related to S3/S4 class attributes? – Mako212 Dec 19 '18 at 17:29
  • Something related to ponder: https://stackoverflow.com/q/41191623/324364 – joran Dec 19 '18 at 17:31
  • @Mike I'm not sure I understand you. – s_baldur Dec 19 '18 at 17:37
  • Found something relevant at https://stat.ethz.ch/pipermail/r-help/2015-December/434647.html; it is stated there that the R language definition lies. – s_baldur Dec 19 '18 at 17:45
  • Have the same people commented here as well, or is it just a copy of the link you shared? http://r.789695.n4.nabble.com/Question-about-a-passage-in-R-language-td4715474.html – Ronak Shah Dec 28 '18 at 05:09

1 Answer

I think it may simply be mis-documented in the current R language definition document.

As you've found, the behaviour is the opposite of what is described. Note that, in your example, if you subset using v[1:length(v)], you get the behaviour you expected from v[]: the custom label attribute is dropped (class and levels survive only because the factor method for [ restores them). So the empty [] is the exception that returns the attributes unchanged.
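A minimal sketch of that contrast, reusing the `v` from the question. It uses `seq_along(v)` as a stand-in for `1:length(v)` and compares only attribute names, not their order:

```r
v <- factor(c(dog = 1, cat = 3))
attr(v, "label") <- "feeling confused"

# Empty index: every attribute survives, including the custom "label".
empty_idx <- names(attributes(v[]))

# Explicit index covering every element: "label" is dropped, while
# names, levels and class are still present on the result.
full_idx <- names(attributes(v[seq_along(v)]))

"label" %in% empty_idx   # TRUE
"label" %in% full_idx    # FALSE
```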

While looking for the answer, I found an illustrative commit/comment (see the diffs here: https://github.com/wch/r-source/commit/6b3480e05e9671a517d70c80b9f3aac53b6afd9d#diff-3347e77b1c102d875a744a2cd7fa86e5). The author describes the behaviour that you have observed:

Subsetting (other than by an empty index) generally drops all attributes except @code{names}, @code{dim} and @code{dimnames} which are reset as appropriate. On the other hand, subassignment generally preserves attributes even if the length is changed. Coercion drops all attributes.

I think that if the index inside [] is empty, the object that is returned is simply a copy of the original object.
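The three rules quoted from the commit can be checked on a plain (classless) vector, where no method for `[` interferes. This is only a sketch: the attribute name `unit` is made up for illustration, and `as.character()` stands in for coercion in general.

```r
x <- c(a = 1, b = 2, c = 3)
attr(x, "unit") <- "cm"          # hypothetical custom attribute

# Subsetting with an empty index keeps the custom attribute ...
attr(x[], "unit")                # "cm"

# ... while subsetting with a non-empty index keeps only the names.
names(attributes(x[1:3]))        # "names"

# Subassignment preserves attributes even when the length changes.
y <- x
y[5] <- 10
attr(y, "unit")                  # "cm"

# Coercion drops all attributes, names included.
attributes(as.character(x))      # NULL
```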

EDIT (from comments below):

The attributes of v and v[] likely appear in a different order because of the way the attributes are assigned to the new object in this special case of subsetting with an empty index. Further, the different order shouldn't be considered a bug, because attributes are not supposed to have an order (see help(attributes)). Note that help("[") accurately describes the behaviour you observed (unlike the language definition you referenced) and explains why one would want this behaviour:

An empty index selects all values: this is most often used to replace all the entries but keep the ‘attributes’.
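As an illustration of the use case that help page describes, here is a small sketch with a matrix (the object names are arbitrary). Assigning through an empty index overwrites the values but leaves dim and dimnames alone, whereas a plain assignment replaces the whole object:

```r
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))

z <- m
z[] <- 0      # replace every entry; the attributes are kept
dim(z)        # 2 3
dimnames(z)   # same row and column names as m

w <- m
w <- 0        # plain assignment: w is now just the number 0
dim(w)        # NULL
```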

E. Brown
  • `v[1:length(v)]` preserves class and levels attribute though. My example suggests it is not simply a copy because the attribute order changes. – s_baldur Dec 28 '18 at 07:48
  • I think this is likely because of the way the attributes are assigned to the new subset in this special case of subsetting with an empty index. Further, it shouldn't be considered a bug, because attributes are not supposed to have an order (see `help(attributes)`). Note that in `help("[")`, the behaviour you observed is accurately described, and explains why one would want this behaviour. "An empty index selects all values: this is most often used to replace all the entries but keep the ‘attributes’." – E. Brown Dec 28 '18 at 16:08