I am looking for a clean, tidyverse way in R to recode a variable with a range of values into an ordered factor. This is an example of what I have:

library(dplyr)

set.seed(1505)
n <- 100
df <- data.frame(id = 1:n,
                 education = sample(1:5, n, replace = TRUE))

df %>%
  mutate(education_recoded = case_when(education %in% 1:3 ~ "Up to secondary school",
                                       education %in% 4:5 ~ "University studies or higher"),
         education_recoded = ordered(education_recoded, levels = c("Up to secondary school", "University studies or higher"))) %>%
  as_tibble()

Is there a way to do this in one line, so that I don't have to repeat the labels in the ordered() call? As far as I can tell, I am looking for something like recode_factor() that works with a range of values.

Thanks a lot!

Peter
Helena

1 Answer

Using cut with labels:

df$education_recoded <- cut(df$education, breaks = c(0,3,5), 
                            labels = c("Up to secondary school", 
                                       "University studies or higher"))
# compare the values
table(df$education_recoded, df$education)
#                               1  2  3  4  5
# Up to secondary school       21 24 24  0  0
# University studies or higher  0  0  0 15 16
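
To see why these breaks work, here is a minimal sketch in base R. Without labels, cut() names each level after its interval, and by default the intervals are closed on the right. The variable names x and intervals are only illustrative.

```r
# Default level names reveal the intervals that breaks = c(0, 3, 5) defines.
x <- 1:5
intervals <- cut(x, breaks = c(0, 3, 5))
levels(intervals)
# "(0,3]" "(3,5]"  : values 1-3 fall in the first bin, 4-5 in the second

# A value equal to the lowest break (0 here) is outside (0,3] and becomes NA
# unless include.lowest = TRUE is set.
is.na(cut(0, breaks = c(0, 3, 5)))
```

Starting the breaks at 0 rather than 1 is what keeps the value 1 inside the first interval.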

Or using pipes:

library(dplyr)  

df %>% 
  mutate(education_recoded = cut(education, breaks = c(0,3,5), 
                                 labels = c("Up to secondary school", 
                                            "University studies or higher")))
zx8754
    Thanks a lot. I read up on cut() and saw that I can also just add ordered_result = TRUE to make it an ordered variable! – Helena Jan 25 '22 at 11:44
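
The approach from the comment above could look roughly like this, reusing the question's data and the answer's pipe version. ordered_result is an argument of base R's cut(); the name out is only illustrative.

```r
library(dplyr)

set.seed(1505)
n <- 100
df <- data.frame(id = 1:n,
                 education = sample(1:5, n, replace = TRUE))

# ordered_result = TRUE makes cut() return an ordered factor,
# so the labels only need to be written once.
out <- df %>%
  mutate(education_recoded = cut(education, breaks = c(0, 3, 5),
                                 labels = c("Up to secondary school",
                                            "University studies or higher"),
                                 ordered_result = TRUE)) %>%
  as_tibble()

is.ordered(out$education_recoded)  # check that the factor is ordered
levels(out$education_recoded)      # levels follow the order of the labels
```

The level order comes from the order of the labels, which must match the order of the intervals defined by breaks.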