Conditionally Revalue a Factor in R

Question

This answer may very well be obvious (I hope it is), but I kept only finding convoluted solutions. What I'd like to do is conditionally revalue a factor based on the levels of another factor.

Here's an example using the mtcars dataset:

data(mtcars)
mtcars$gear <- as.factor(mtcars$gear)
mtcars$am <- as.factor(mtcars$am)

table(mtcars$gear, mtcars$am) # examining the levels
levels(mtcars$gear)
# [1] "3" "4" "5"
levels(mtcars$am)
# [1] "0" "1"

Now among those cars with a gear level of "5", how can I assign a new "gear" level of "6" to those with an "am" level of "1", while retaining the factor levels "3","4","5" for "gear"? This is a much simpler example, but given the complexity of my dataset I'd prefer to keep the vectors as factors (and not transform to numeric and back, for example).

score 2 · Accepted Answer · answered May 09 '14 at 23:55

There is no "6" level in gear to begin with, so you need to create one:

levels(mtcars$gear) <- c(levels(mtcars$gear), "6")

You can then conditionally assign with the [<- function:

mtcars$gear[ mtcars$am==1 ] <- "6"
table(mtcars$gear, mtcars$am)

     0  1
  3 15  0
  4  4  0
  5  0  0
  6  0 13
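
Note that the assignment above recodes every car with an "am" level of "1", whatever its gear. If only the cars that also have a gear level of "5" should change, as the question describes, the two conditions can be combined. This is a minimal sketch starting from a fresh copy of mtcars; the helper name `sel` is only illustrative:

```r
# Sketch: recode only the cars that have gear "5" AND am "1".
data(mtcars)                        # reload an unmodified copy
mtcars$gear <- as.factor(mtcars$gear)
mtcars$am   <- as.factor(mtcars$am)

# Add the new level first; the existing levels "3", "4", "5" are kept.
levels(mtcars$gear) <- c(levels(mtcars$gear), "6")

# Combine both conditions with the vectorised `&` (sel is a hypothetical name).
sel <- mtcars$gear == "5" & mtcars$am == "1"
mtcars$gear[sel] <- "6"

table(mtcars$gear, mtcars$am)
#      0  1
#   3 15  0
#   4  4  8
#   5  0  0
#   6  0  5
```

Level "5" stays in the factor even though no rows use it any more, which is what keeps "3", "4", "5" available as requested.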

You cannot assign values to a factor variable if there is no corresponding 'level' in the factor attributes.
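
To see what that looks like in practice, here is a small sketch on a toy factor (the names `f` and `g` are made up for illustration). Without the level, R issues a warning and stores NA; with the level added first, the same assignment works:

```r
# Sketch: assigning a value that is not an existing level yields NA.
f <- factor(c("3", "4", "5"))

g <- f
g[1] <- "6"      # R warns about an invalid factor level; the element becomes NA
is.na(g[1])      # TRUE

# Adding the level first makes the same assignment succeed.
levels(f) <- c(levels(f), "6")
f[1] <- "6"
as.character(f)  # "6" "4" "5"
```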

IRTFM


Beautiful, simple solution! (My Rube Goldberg-esque workaround entailed converting to numeric and back.) – statsRus May 10 '14 at 00:17
I have found factors to be quite error-prone. I generally prefer to leave everything as character or integer until ready to actually do the analysis. Terry Therneau, whom I greatly respect, says that the Mayo Clinic mandates `options(stringsAsFactors=FALSE)`. – IRTFM May 10 '14 at 00:20
Good workflow tip on leaving factors until the end! I'm curious -- in general what have you found error-prone with respect to factors in R? – statsRus May 10 '14 at 00:31
They are really (constrained) integer vectors. If I could only remember to always use `as.character()` around them I'd be a happier person. – IRTFM May 10 '14 at 00:55
