
I have two text files: ped1.txt and ped2.txt. Fields are separated by tabs or spaces.

ped1.txt

222 333 444
333 458 458
458 774 556
500K lines...

ped2.txt

222 -12006
333 -11998

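I need to recode every number in file 1 using the key in file 2: wherever a value in ped1.txt matches the first column of ped2.txt, it should be replaced by the corresponding value from the second column. Values with no key entry stay unchanged. The result should look like this: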

-12006 -11998 444
-11998    458 458
   458    774 556
500K lines...

How can I do this? Thanks.

jogo
Zoomman

2 Answers


With the two files read into data frames:

ped1
#    V1  V2  V3
# 1 222 333 444
# 2 333 458 458
# 3 458 774 556
ped2
#    V1     V2
# 1 222 -12006
# 2 333 -11998

You can do either:

apply(ped1, c(1,2), function(x) ifelse(x %in% ped2$V1, ped2$V2[ped2$V1 == x], x))
#          V1     V2  V3
# [1,] -12006 -11998 444
# [2,] -11998    458 458
# [3,]    458    774 556

or

sapply(ped1, function(x) plyr::mapvalues(x, from = ped2$V1, to = ped2$V2, warn_missing = FALSE))
#          V1     V2  V3
# [1,] -12006 -11998 444
# [2,] -11998    458 458
# [3,]    458    774 556

depending on your preferences.

DGKarlsson

Use as.vector() to convert the first table into a vector (if it is a data frame rather than a matrix, go through as.matrix() first).

Then recode the values with mapvalues() from the plyr package or, possibly more efficiently, with set() from the data.table package. To use set(), put the vector into a single-column data.table first.

Once the replacements are done, convert back to a matrix with matrix(your_new_vector, ncol = original_number_of_cols).
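The flatten, recode, reshape steps above can be sketched as follows. This is a minimal sketch and not necessarily what the answer had in mind: it swaps in base R's match() for mapvalues() and set() so that it runs without extra packages, and it assumes the files were read with something like read.table(), which gives the default column names V1, V2, V3. The small data frames and the names v, idx and recoded are only for illustration.

```r
# Example data mirroring the question.
ped1 <- data.frame(V1 = c(222, 333, 458),
                   V2 = c(333, 458, 774),
                   V3 = c(444, 458, 556))
ped2 <- data.frame(V1 = c(222, 333), V2 = c(-12006, -11998))

# Flatten the table into one vector (column-major order).
v <- as.vector(as.matrix(ped1))

# Position of each value in the key column; NA means "no key entry".
idx <- match(v, ped2$V1)

# Replace only the values that have a key entry, leave the rest as they are.
hit <- !is.na(idx)
v[hit] <- ped2$V2[idx[hit]]

# Rebuild a matrix with the original number of columns.
recoded <- matrix(v, ncol = ncol(ped1), dimnames = list(NULL, names(ped1)))
recoded
```

Because match() is vectorized, this does a single lookup over all values rather than one function call per cell, which may matter with 500K lines.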

Have fun

ellebaek