Converting letters (strings) to numbers in R

Question

I am trying to convert pairs of letters (genotypes) like AA, GG, GA to numerical values. For example, I would like AA = 0, GG = 1, AG = 2, CC = 3, TT = 4, etc. A sample of my data looks like this:

S1 S2 S3
AA CC AA
AA GG TT
AA CC GG
AA AG AA

I have been trying to use the mutate function in the dplyr package, but I am kinda stuck.

The code that I have been running that gives me an error is:

DF1 <- DF %>% mutate_each(funs(chartr("AA", "0", .)))

Error in chartr("AA", "0", c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L)) : 'old' is longer than 'new'

I tried to then edit the code to:

DF1 <- DF %>% mutate_each(funs(chartr("AA", "00", .)))

This gave me the results below, but it's still not what I want. Can someone please help me out with some ideas on how to deal with it?

S1 S2 S3
1 00 CC 00
2 00 GG TT
3 00 CC GG
4 00 0G 00

My desired result is:

I think there's an error in your desired results since row 4 of `S2` would be `AG` == `2` vs `1` that you have. — hrbrmstr, Sep 30 '15 at 21:16

score 1 · Answer 1 · answered Sep 30 '15 at 21:14

dat <- read.table(text="S1 S2 S3
AA CC AA
AA GG TT
AA CC GG
AA AG AA", header=TRUE, stringsAsFactors=FALSE)

Assuming a finite translation table:

xlate <- c(AA = 0, GG = 1, AG = 2, CC = 3, TT = 4)

dat[] <- lapply(dat, function(x) { xlate[x] })

dat
##   S1 S2 S3
## 1  0  3  0
## 2  0  1  4
## 3  0  3  1
## 4  0  2  0
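
As a side note on why the `chartr()` attempts in the question behave the way they do: `chartr()` translates single characters, not whole strings, so `"AA"` to `"00"` just turns every `A` into `0`. The sketch below shows that, and then wraps the lookup in a small helper. The helper name `recode_genotype` is hypothetical (not part of the original answer), and the `as.character()` call is there on the assumption that the columns may be factors, which the integer codes in the error message seem to hint at.

```r
# chartr() maps character by character, so "AG" becomes "0G", not a number.
chartr("AA", "00", "AG")
#> [1] "0G"

# Lookup table from the answer above.
xlate <- c(AA = 0, GG = 1, AG = 2, CC = 3, TT = 4)

# Hypothetical helper: recode one column through the lookup table.
recode_genotype <- function(x, table = xlate) {
  # as.character() matters for factor columns: indexing a named vector
  # with a factor uses its integer codes, not its labels.
  # unname() drops the genotype names carried over from the lookup.
  unname(table[as.character(x)])
}

# Same sample data, read with factors this time (an assumption about DF).
dat <- read.table(text = "S1 S2 S3
AA CC AA
AA GG TT
AA CC GG
AA AG AA", header = TRUE, stringsAsFactors = TRUE)

dat[] <- lapply(dat, recode_genotype)
dat
#>   S1 S2 S3
#> 1  0  3  0
#> 2  0  1  4
#> 3  0  3  1
#> 4  0  2  0
```

Any genotype missing from the table (for example `GA`, unless you add it) comes back as `NA`, which makes gaps in the translation table easy to spot.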
