I am trying to recode some psychometric scales for scoring in R. Often the scales will come in the form of a factor that will need to be converted to a number to calculate the score; for example ("Never" = 0, "Sometimes" = 1, "Always" = 2).

I am having limited success in scoring specific numbers. If the scale starts from 1 (e.g. "Never" = 1, "Sometimes" = 2, "Always" = 3) then everything seems to work okay. However, if the scale starts from 0 (or any number other than 1), the conversion to numeric doesn't go as expected. I have found a temporary solution, but it is rather cumbersome, as I need to convert to a factor, then to character, and finally to numeric.

What I am trying to do is have R assign a number to each specific level of the factor and then return the number when converting to numeric. For example if I want "Never" = 0, "Sometimes" = 1 and "Always" = 2 then R would return:

> answers <- c("Never", "Sometimes", "Always", "Always", "Sometimes", "Never")
> some_function(answers)
[1] 0 1 2 2 1 0

My temporary and less-than-ideal solution is to do the following:

> as.numeric(as.character(fct_recode(as_factor(answers),
+                             "0" = "Never",
+                             "1" = "Sometimes",
+                             "2" = "Always")))
[1] 0 1 2 2 1 0

If I try to run the above code without converting to character then it doesn't return what I am after:

> as.numeric(fct_recode(as_factor(answers),
+                              "0" = "Never",
+                              "1" = "Sometimes",
+                              "2" = "Always"))
[1] 1 2 3 3 2 1

Does anyone know how I can more efficiently convert a factor variable to numeric and assign specific numeric values to the levels of the factor?

Thanks!

Aaron

2 Answers

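In R, indexing starts at 1, and a factor stores its values as integer codes that index into its levels. So coercing a factor with as.integer (or as.numeric) returns those codes, which always start at 1, rather than the labels. Instead, we can use a named vector as a lookup table and index it with the answers: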

unname(setNames(0:2, c("Never", "Sometimes", "Always"))[answers])

-output

[1] 0 1 2 2 1 0
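
The named-vector lookup above can be wrapped in a small helper for scoring several scales. This is only a sketch: `score_scale` and `key` are hypothetical names that do not come from any package, and the warning on unmatched responses is an added assumption about what would be useful.

```r
# Hypothetical helper: look up each response in a named score vector.
score_scale <- function(x, key) {
  # as.character() matters here: indexing with a factor would use its
  # integer codes instead of its labels
  scores <- unname(key[as.character(x)])
  # Responses that are not in the key come back as NA, so flag them
  if (any(is.na(scores) & !is.na(x))) {
    warning("some responses are not in the scoring key")
  }
  scores
}

key <- c(Never = 0, Sometimes = 1, Always = 2)
answers <- c("Never", "Sometimes", "Always", "Always", "Sometimes", "Never")

score_scale(answers, key)          # character input
score_scale(factor(answers), key)  # factor input gives the same scores
```

Both calls should return 0 1 2 2 1 0, because the lookup goes through the labels and never touches the factor's internal codes.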

If returning a factor is acceptable, both the levels and the corresponding labels can be specified in the factor call:

factor(answers, levels = c("Never", "Sometimes", "Always"), labels = 0:2)
[1] 0 1 2 2 1 0
Levels: 0 1 2

But as soon as that factor is coerced to integer, we get the internal storage codes, which start from 1, instead of the labels:

as.integer(factor(answers, levels = c("Never", "Sometimes", 
         "Always"), labels = 0:2))
[1] 1 2 3 3 2 1
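
To make the difference between codes and labels concrete, here is a short sketch using only base R. The variable name `f` is just for illustration.

```r
answers <- c("Never", "Sometimes", "Always", "Always", "Sometimes", "Never")
f <- factor(answers, levels = c("Never", "Sometimes", "Always"), labels = 0:2)

as.integer(f)                # internal codes: 1 2 3 3 2 1
levels(f)                    # labels, stored as character: "0" "1" "2"
as.numeric(as.character(f))  # labels read as numbers: 0 1 2 2 1 0
as.numeric(levels(f))[f]     # same result, converting each level only once
```

The last form works because indexing with a factor uses its integer codes, so each code picks out the numeric version of its own level.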

Alternatively, type.convert can turn the factor's labels back into numbers:

type.convert(factor(answers, levels = c("Never", "Sometimes", 
      "Always"), labels = 0:2), as.is = TRUE)
[1] 0 1 2 2 1 0
akrun

You can define the order you want and use match. match returns positions such as 1, 2, 3, so subtract 1 to get scores starting from 0.

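answers <- c("Never", "Sometimes", "Always", "Always", "Sometimes", "Never")
order <- c('Never', 'Sometimes', 'Always')
# Match against the explicit ordering rather than unique(answers), so the
# result does not depend on the order in which responses happen to appear
match(answers, order) - 1
#[1] 0 1 2 2 1 0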
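
If the scores are not simply consecutive integers, the same match idea can index into a vector of scores instead of subtracting 1. A minimal sketch, where `lvls`, `values` and the score values themselves are made up for illustration:

```r
answers <- c("Never", "Sometimes", "Always", "Always", "Sometimes", "Never")
lvls <- c("Never", "Sometimes", "Always")

# Hypothetical scores, one per level, in the same order as lvls
values <- c(10, 20, 30)
values[match(answers, lvls)]
#[1] 10 20 30 30 20 10

# Reverse-scored item: "Never" = 2, "Sometimes" = 1, "Always" = 0
rev(0:2)[match(answers, lvls)]
#[1] 2 1 0 0 1 2
```

Any response that is not in `lvls` gives NA from match, and therefore NA in the score.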
Ronak Shah