For example, if we have a data frame called x in R with a factor column and we want to obtain the level of each element as a string, this should work:

levels(x$column)[x$column]

Can anyone explain how this R syntax works?

Thanks for your help

1 Answer

Consider a simple one-column data frame whose column is a factor:

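# stringsAsFactors = TRUE is needed in R >= 4.0.0 to get a factor column
df <- data.frame(x = c("a", "b", "c"), stringsAsFactors = TRUE)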

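The levels() function returns all the levels of its input as a character vector. We then subset that character vector with the factor itself. When a factor is used as an index, R uses its underlying integer codes, not its printed labels: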

levels(df$x)[df$x]
[1] "a" "b" "c"
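To make the indexing step more concrete, here is a minimal sketch that spells out the integer codes explicitly. The data and the variable names (lv, codes, by_factor, by_codes) are made up for illustration, and it assumes stringsAsFactors = TRUE is passed explicitly so that the column really is a factor:

```r
# Illustrative data (not from the question); the order is shuffled on
# purpose so the codes differ from 1, 2, 3.
# Assumption: stringsAsFactors = TRUE, since R >= 4.0.0 keeps strings
# as character vectors by default.
df <- data.frame(x = c("b", "a", "c", "a"), stringsAsFactors = TRUE)

lv    <- levels(df$x)      # character vector of the distinct levels
codes <- as.integer(df$x)  # integer code of each element (its position in lv)

by_factor <- lv[df$x]      # a factor index is treated as its integer codes
by_codes  <- lv[codes]     # the same subsetting written out explicitly

print(lv)
print(codes)
print(by_factor)
print(identical(by_factor, by_codes))
```

Because the index is just a vector of positions, it can be longer than levels(df$x) and can repeat positions, which is why the result has one string per row rather than one per level.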
Tim Biegeleisen
  • I do not get that answer. Why doesn't `levels(df$x)[c("a", "b", "c")]` do the same? `df$x` does not yield indices like you say, but a vector of characters. – PascalIv Sep 11 '18 at 08:22
  • @PascalIv Because in your version you're passing in a character vector, not a factor. Try `levels(df$x)[as.factor(c("a", "b", "c"))]` and it should work. – Tim Biegeleisen Sep 11 '18 at 08:23
  • Forgot about that stringsAsFactors stuff. Thanks! – PascalIv Sep 11 '18 at 08:26
  • I get what you are trying to explain, but I was expecting an answer explaining that kind of indexing, since in my opinion it isn't intuitive compared to other programming languages. You can index with a vector longer than levels(df$x), e.g.: levels(df$x)[as.factor(c("a", "b", "c", "c"))]. It sounds pretty weird to me, maybe I'm just not very used to R and that's the reason... But thanks for your help anyway! – Héctor Olivera Sep 11 '18 at 08:50
  • @HéctorOlivera Yes, but in that case the fourth element would return `NA`, because that level does not exist in the factor `df$x`. – Tim Biegeleisen Sep 11 '18 at 08:52
  • I've corrected it, I was thinking of "c" as the last element, as you can see. But yes, I get it :) – Héctor Olivera Sep 12 '18 at 10:55