
I want to replace Sire with a new ID if Sire is 0 and Dam is NOT 0, and then add a new row for each new ID with its Sex.

For example, I need to replace the 0 in the first row with s1073 and add a new row to the data: 1 s1073 0 0 2. Similarly, if Dam is 0 and Sire is NOT 0, a new row should be added. For instance, in row 7 I need to replace the Dam value 0 with d900 and add a new row to the data frame: 1 d900 0 0 2.

Can anyone please help me sort this out?

FID  ID Sire  Dam  Sex
1 1832    0 1073   1
1 1833 1201 1251   2
1 1834   15  560   1
1 1835 1598 1583   1
1 1836    0   13   1
1 1837 1107  562   1
1 1838  900    0   1
1 1839  900  571   2
1 1840  900    0   1
1 1841    0  415   1
1 1842    0    0   2
1 1843 1201  303   2
1 1844    0    0   1
1 1845 1107  557   2
1 1846   15  749   2
zx8754
user2808642

2 Answers


I am guessing this is PLINK FAM format: some individuals are missing a father or a mother, and we want to add the missing parent for individuals that have exactly one parent recorded. If both parents are missing, no parents are added.

# dummy fam data with missing parents
df1 <- read.table(text = "FID   IID Father  Mother  Sex
1   1   0   2   1
                  1 2   0   0   2
                  1 3   0   2   1
                  1 4   0   2   2
                  2 1   3   0   1
                  2 2   3   0   2
                  2 3   0   0   1
                  3 1   0   0   1
                  4 1   0   0   1
                  4 2   0   0   2
                  4 3   1   2   2
                  4 4   1   2   2
                  ", header = TRUE, 
                  colClasses = "character")

Notes about the dummy data:
- FID == 1: the children are missing a father
- FID == 2: the children are missing a mother
- FID == 3: a single-individual family with no parents
- FID == 4: no missing parents to add

Task: add the missing Father or Mother only if exactly one of them is missing, i.e. if both are missing (Father == 0 and Mother == 0), do not add parents.

library(dplyr) # dplyr keeps each step explicit

# replace a missing (0) Father or Mother with a new ID built from the
# known parent's ID, the suffix "f" or "m", and the child's IID
df1 <- 
  df1 %>% 
  mutate(
    FatherNew = if_else(Father == "0" & Mother != "0", paste0(Mother, "f", IID), Father),
    MotherNew = if_else(Mother == "0" & Father != "0", paste0(Father, "m", IID), Mother))

# add missing Fathers
missingFather <- df1 %>% 
  filter(
    FatherNew != "0" &
      MotherNew != "0" &
      !FatherNew %in% df1$IID) %>% 
  transmute(
    FID = FID,
    IID = FatherNew,
    Father = "0",
    Mother = "0",
    Sex = "1") %>%
  unique


# add missing Mothers
missingMother <- df1 %>% 
  filter(
    FatherNew != "0" &
      MotherNew != "0" &
      !MotherNew %in% df1$IID) %>% 
  transmute(
    FID = FID,
    IID = MotherNew,
    Father = "0",
    Mother = "0",
    Sex = "2") %>%
  unique

# replace the original Father/Mother columns with the updated IDs
res <- df1 %>% 
  transmute(
    FID = FID,
    IID = IID,
    Father = FatherNew,
    Mother = MotherNew,
    Sex = Sex)

# add missing Fathers/Mothers as new rows, and sort
res <- rbind(
  res,
  missingFather,
  missingMother) %>%
  arrange(FID, IID)

Result (check the output):

res
#    FID IID Father Mother Sex
# 1    1   1    2f1      2   1
# 2    1   2      0      0   2
# 3    1 2f1      0      0   1
# 4    1 2f3      0      0   1
# 5    1 2f4      0      0   1
# 6    1   3    2f3      2   1
# 7    1   4    2f4      2   2
# 8    2   1      3    3m1   1
# 9    2   2      3    3m2   2
# 10   2   3      0      0   1
# 11   2 3m1      0      0   2
# 12   2 3m2      0      0   2
# 13   3   1      0      0   1
# 14   4   1      0      0   1
# 15   4   2      0      0   2
# 16   4   3      1      2   2
# 17   4   4      1      2   2
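
As a quick sanity check on a result like the one above, it can help to confirm that every parent ID referenced in Father or Mother also has its own row. The following is only a minimal base R sketch: `orphan_parents` is a hypothetical helper name, not part of the steps above, and it ignores FID (as the `%in% df1$IID` filters do), so it assumes parent IDs do not need to be matched within a family.

```r
# Hypothetical helper (not part of the original steps): returns parent IDs
# that appear in Father/Mother but have no row of their own in IID.
# Assumption: "0" marks a missing parent and FID is ignored.
orphan_parents <- function(ped) {
  parents <- unique(c(ped$Father, ped$Mother))
  parents <- parents[parents != "0"]
  setdiff(parents, ped$IID)
}

# Tiny example in the same layout as `res`
ped <- data.frame(
  FID    = c("1", "1", "1"),
  IID    = c("1", "2", "2f1"),
  Father = c("2f1", "0", "0"),
  Mother = c("2", "0", "0"),
  Sex    = c("1", "2", "1"),
  stringsAsFactors = FALSE
)

orphan_parents(ped)  # character(0): every referenced parent has a row
```

Calling `orphan_parents(res)` on the full result should likewise return an empty vector if the new parent rows were added correctly.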
zx8754
  • That's fantastic. 90% of my problem is solved. Just one last question: some of the fathers or mothers appear more than once, and I want separate IDs for them. For example, in the example above Father '3' appears twice, so I need the mothers as 3m1, 3m2, ... and the IID should be adjusted in the same fashion. Do you know how I can do that? – user2808642 Oct 11 '16 at 16:00
  • @user2808642 But then we are saying the siblings are from the same Father (3) and from 2 different Mothers (3m1 and 3m2)? – zx8754 Oct 11 '16 at 21:02
  • @user2808642 It would be nice if you could copy my dummy data to your post and add the expected output. – zx8754 Oct 11 '16 at 21:04
  • Thanks. Actually, I am not worried about siblings here; this is sheep pedigree data, which is needed to run another simulation. The output of your dummy data is exactly the one you showed in 'res'. Could you please help me add 3m1, 3m2, ... as mentioned above? – user2808642 Oct 17 '16 at 09:31
  • @user2808642 I think this should be as easy as pasting `IID` onto the new parents' IDs. I have updated the post; see if this is what you need. – zx8754 Oct 17 '16 at 21:19

I think this answer was very useful for me to figure out the missing offspring. Thanks!

user2808642
  • Please read [How does accepting an answer work?](http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work). And delete this answer. – zx8754 Oct 20 '16 at 11:49