I'm a newbie to R, so please be patient.

I have 4 populations of one species (`pop` in one column) with different sample sizes. Having analysed mtDNA haplotypes (`hap` in another column) for each individual (one per row), I now need to standardize the sample size across all populations and estimate Fst. I suppose this could be done with a resampling method with replicates, drawing a random subsample within each population, but I'm not sure what the proper way to do it is.

Also, I have never estimated Fst in R. Is there a way to obtain the output in a table similar to the one below? Or, alternatively, to pass the results of the subsampling directly to an R package that can estimate Fst?

df <- read.table(text="pop  hap
1   A
1   A
1   B
1   B
1   B
1   C
1   D
2   F
2   F
2   A
2   A
2   B
2   E
3   A
3   A
3   B
3   D
4   A
4   A
4   A
4   B
4   B", header=TRUE)
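A minimal sketch of the subsampling idea described above, in base R only. It assumes that subsampling means drawing the smallest population's sample size from every population without replacement, and it uses Nei's G_ST computed from haplotype frequencies as a stand-in for Fst; dedicated packages may use different estimators (for example with sample-size corrections), so the numbers are illustrative. The names `subsample_pops`, `gst_pair` and `n_rep` are hypothetical helpers, not part of any package.

```r
# Same data as in the question, rebuilt so the block runs on its own.
df <- data.frame(
  pop = rep(1:4, times = c(7, 6, 4, 5)),
  hap = c("A", "A", "B", "B", "B", "C", "D",
          "F", "F", "A", "A", "B", "E",
          "A", "A", "B", "D",
          "A", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Draw n individuals (rows) without replacement from every population.
subsample_pops <- function(d, n) {
  rows <- unlist(lapply(split(seq_len(nrow(d)), d$pop),
                        function(i) i[sample.int(length(i), n)]))
  d[rows, , drop = FALSE]
}

# Gene diversity 1 - sum(p^2) for a vector of haplotype frequencies.
gene_div <- function(p) 1 - sum(p^2)

# Nei's G_ST = (H_T - H_S) / H_T for populations a and b
# (assumption: used here as a simple Fst analogue for haplotype data).
gst_pair <- function(d, a, b) {
  d <- d[d$pop %in% c(a, b), ]
  freq <- prop.table(table(d$pop, d$hap), margin = 1)  # one row per population
  h_s <- mean(apply(freq, 1, gene_div))   # mean within-population diversity
  h_t <- gene_div(colMeans(freq))         # diversity of the pooled frequencies
  if (h_t == 0) return(NA_real_)          # undefined when there is no variation
  (h_t - h_s) / h_t
}

set.seed(1)                        # reproducible subsamples
n_min <- min(table(df$pop))        # smallest sample size (4 for this data)
pops  <- sort(unique(df$pop))
pairs <- t(combn(pops, 2))         # all pairs of populations, one per row
n_rep <- 200                       # number of replicate subsamples (arbitrary)

# One column per replicate, one row per population pair.
reps <- replicate(n_rep, {
  s <- subsample_pops(df, n_min)
  apply(pairs, 1, function(p) gst_pair(s, p[1], p[2]))
})

# Mean over replicates, arranged as a lower-triangular pairwise table.
fst <- matrix(NA_real_, length(pops), length(pops),
              dimnames = list(as.character(pops), as.character(pops)))
fst[cbind(as.character(pairs[, 2]), as.character(pairs[, 1]))] <-
  rowMeans(reps, na.rm = TRUE)
print(round(fst, 3))
```

A single call such as `subsample_pops(df, n_min)` returns a data frame in the same `pop`/`hap` layout as the input, so it could in principle be handed to a population-genetics package instead of `gst_pair`.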

Hope I made myself clear.

Thank you very much, in advance.

erasmortg
Silvia
  • I think you forgot to add your reference output table. – erasmortg Jul 30 '15 at 17:44
  • I don't know what the table you pasted in your comment is supposed to look like. Please edit your question accordingly. As for the calculation, perhaps the package `PopGenome` could help you: https://cran.r-project.org/web/packages/PopGenome/PopGenome.pdf – erasmortg Jul 30 '15 at 18:12
  • Sorry erasmortg! You're right. It would be similar to this (a simple example): pop hap 1 A 1 B 1 B 1 C 1 D 2 F 2 F 2 A 2 A 2 B 2 E 3 A 3 B 3 D 4 A 4 A 4 B 4 B, which is actually similar to the input file. I just corrected it in the main question, but it seems the formatting is not kept... There is one column with pop, one column with the corresponding randomly subsampled haplotypes within each population, and each row is an individual. – Silvia Jul 30 '15 at 18:14
  • Reading into the package, this is way outside my area of expertise, but please take a look at this: https://cran.r-project.org/web/packages/PopGenome/vignettes/An_introduction_to_the_PopGenome_package.pdf Basically, you need to create an object of class 'genome' and then calculate what you need. I am unwilling to add that as an answer, as it could turn out to be much more problematic. I am also not sure whether your current `df` is a valid input for the `genome` class. – erasmortg Jul 30 '15 at 18:33
  • Thank you @erasmortg! I will have a look at it anyway! – Silvia Jul 30 '15 at 18:46

0 Answers