In the following data, the levels of both variables are coded numerically:

dat = read.csv("https://studio.edx.org/c4x/HarvardX/PH525.1x/asset/assoctest.csv")
head(dat)

I am replacing these codes with character strings to make reading and graphing easier. I can do this successfully with the dplyr mutate function:

library(dplyr)

dat_char = mutate(dat, allele = replace(allele, allele == 0, "AA/Aa")) %>%
  mutate(allele = replace(allele, allele == 1, "aa")) %>%
  mutate(case = replace(case, case == 0, "control")) %>%
  mutate(case = replace(case, case == 1, "case"))

The above code works perfectly well, but it is repetitive and fiddly to write. I am sure there is a way to perform some of these replacements simultaneously and slim down the code, but I am not sure how. For example, I have tried using vectors as the lookup and replace values.

dat_char = mutate(dat, allele=replace(allele, allele==c(0,1), c("AA/Aa", "aa"))) %>%
mutate(case=replace(case, case==c(0,1),  c("control", "case")))
head(dat_char)

This just makes a mess, but it gives a sense of what I am trying to achieve.

Robert

2 Answers

You can use a simple ifelse here, but if you have several values to replace, consider recode or case_when:

library(dplyr)

dat %>%
  mutate(allele = recode(allele, `0` = 'AA/Aa', `1` = 'aa'), 
         case = recode(case, `0` = 'control', `1` = 'case'))
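
For comparison, the ifelse and case_when alternatives mentioned above might look roughly like the sketch below. The `toy` data frame is a made-up stand-in for `dat` (same column names, invented values), so treat this as an illustration rather than output from the real file.

```r
library(dplyr)

# Hypothetical stand-in for `dat`: same column names, invented values
toy <- data.frame(allele = c(0, 1, 0, 1), case = c(0, 0, 1, 1))

toy_char <- toy %>%
  mutate(
    # ifelse suits a two-level code: 0 becomes "AA/Aa", anything else "aa"
    allele = ifelse(allele == 0, "AA/Aa", "aa"),
    # case_when scales to more levels; values matching no condition become NA
    case = case_when(
      case == 0 ~ "control",
      case == 1 ~ "case"
    )
  )

head(toy_char)
```

With only two codes ifelse is the shortest option; case_when is easier to extend if more codes are added later.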
Ronak Shah
  • Perfect - this is exactly what I knew should be possible, I just didn't know about the recode function. – Robert Jun 18 '20 at 10:19
This might also work, although it yields factor columns rather than character strings:

library(dplyr)

dat_char <- mutate(dat,
                   allele = factor(allele,
                                   levels = c(0, 1),
                                   labels = c("AA/Aa", "aa")),
                   case = factor(case,
                                 levels = c(0, 1),
                                 labels = c("control", "case")))
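
If plain character columns are wanted, as in the question, the factor can be wrapped in as.character. A minimal sketch, using a made-up `toy` data frame in place of `dat` (the values are invented for illustration):

```r
library(dplyr)

# Hypothetical stand-in for `dat`: same column names, invented values
toy <- data.frame(allele = c(0, 1, 1), case = c(1, 0, 1))

toy_char <- mutate(toy,
                   # factor() maps each level to its label; as.character() drops the factor class
                   allele = as.character(factor(allele,
                                                levels = c(0, 1),
                                                labels = c("AA/Aa", "aa"))),
                   case = as.character(factor(case,
                                              levels = c(0, 1),
                                              labels = c("control", "case"))))

str(toy_char)
```

Keeping the columns as factors can still be useful for graphing, since the order given in levels controls the order of the categories.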
kmacierzanka