0

I am doing a very simple simulation using hardy-weinberg (for all you genetics junkies) and I am having a terrible time plotting out the frequencies of allele (0,1) frequencies and finally genotypes (0,1,2) frequencies over the course of a 100 generations. I am stuck trying to figure out R's matrices.

N = 30 # Size population in each line
lineN = 100 # Number of family lines
Genes0 = array(NA, dim=c(lineN, 2*N)) 

# Randomly form genotypes by sample function / 30:70 probabilites
# In sample x=c(0:1) represents a and A (30:70) alleles of a gene
for (i in 1:lineN) {
    Genes0[i, ] = sample(x=c(0:1), size=10, replace=T, prob=c(0.3,0.7)) 
}

generationN = 100
ParentGenes = Genes0
for (g in 1:generationN) {
  ChildGenes = array(NA, dim=c(lineN, 2*N))
  for (i in 1:lineN) {
    ChildGenes[i, ] = sample(ParentGenes[i, ], replace=T)
  }
}
  ParentGenes = ChildGenes
    table(ChildGenes)/(lineN*2*N) # Allele frequencies

    #Convert allele to genotypes: AA <=> 2; Aa / aA <=> 1; aa <=> 0.
    Genotypes = array(NA, dim=c(lineN, N))
    for  (j in 1:N) {
        Genotypes[, j] = ChildGenes[, 2*j-1] + ChildGenes[, 2*j]
    }
    table(Genotypes)/(lineN*N) # Genotype frequencies.
mccurcio
  • 1,294
  • 5
  • 25
  • 44
  • `hist(Genotypes)` doesn't work? Not sure what you want. Also, what's up with all those semicolons? – rawr Feb 16 '14 at 03:24
  • What I am trying to do is graph the [allele frequency vs # generations] and [genotype frequency vs # generations]. I would like to see how the gene frequency says the same over time but the proportion of genotypes will vary over time. And you are right, Semi-colons are a bother. – mccurcio Feb 16 '14 at 03:41
  • What should the plot look like? Do you want a simple histogram like @rawr mentioned, or do you want 100 points (one for each generation)? – matt_k Feb 16 '14 at 03:44
  • Figure #1 and even #2 from http://www.nature.com/scitable/knowledge/library/natural-selection-genetic-drift-and-gene-flow-15186648 – mccurcio Feb 16 '14 at 03:47
  • Your `generationN` loop doesn't really do anything. Are you trying to make 100 subsequent generations based on a parent generation, `g`? – rawr Feb 16 '14 at 04:45
  • @rawr: Yes that is exactly what I am trying to do. – mccurcio Feb 16 '14 at 21:33
  • well as of now, it looks like you only have one generation since the loop keeps overwriting the same one. And I don't know how to move from one generation to the next. I'm sure you would have to incorporate some rules so that it is not completely random and that `g[1]` is similar to `g[50]` but definitely `g[49]` is more similar to `g[50]` than `g[1]` – rawr Feb 17 '14 at 16:41

1 Answers1

0

I don't know anything about genetics, so I'm not sure if I'm following but is this what you want:

tab <- do.call(rbind ,apply(ChildGenes, 1, function(x) table(x) / length(x)))

head(tab)

#             0         1
#[1,] 0.2500000 0.7500000
#[2,] 0.4000000 0.6000000
#[3,] 0.2833333 0.7166667
#[4,] 0.2500000 0.7500000
#[5,] 0.4833333 0.5166667
#[6,] 0.3666667 0.6333333

plot(1:100, tab[,1], col = "blue")
points(tab[,2], col = "red")
matt_k
  • 4,139
  • 4
  • 27
  • 33