Standardizing sample size for re-estimation of Fst

Question

I'm a newbie to R, so please be patient.

I have 4 populations of one species (pop in one column), with different sample sizes. Having analysed mtDNA haplotypes (hap in another column) for each individual (each row), I now need to standardize the sample size across all populations and estimate Fst. I suppose this could be done with a resampling method with replicates, drawing a random subsample within each population, but I'm not sure what the proper way to do it is.

Also, I have never estimated Fst in R. Is there a way to obtain the output in a table similar to the one below? Or else, is there an R package that can take the results of the subsampling directly and estimate Fst from them?

df <- read.table(text="pop  hap
1   A
1   A
1   B
1   B
1   B
1   C
1   D
2   F
2   F
2   A
2   A
2   B
2   E
3   A
3   A
3   B
3   D
4   A
4   A
4   A
4   B
4   B", header=TRUE)

I hope I made myself clear.

Thank you very much, in advance.

I don't know what the table you pasted in your comment is supposed to look like. Please edit your question accordingly. As for the calculation, perhaps the package `PopGenome` could help you: https://cran.r-project.org/web/packages/PopGenome/PopGenome.pdf — erasmortg, Jul 30 '15 at 18:12
Sorry, erasmortg! You're right. It would be similar to this (a simple example): pop hap 1 A 1 B 1 B 1 C 1 D 2 F 2 F 2 A 2 A 2 B 2 E 3 A 3 B 3 D 4 A 4 A 4 B 4 B, which is actually similar to the input file. I just corrected it in the main question, but it seems the formatting is not being kept... It has one column with pop, one column with the corresponding randomly subsampled haplotypes within each population, and each row is an individual. — Silvia, Jul 30 '15 at 18:14
I have read up on the package, and this is way outside my area of expertise, but please take a look at this: https://cran.r-project.org/web/packages/PopGenome/vignettes/An_introduction_to_the_PopGenome_package.pdf Basically you need to create an object of class 'genome', and afterwards calculate what you need. I am unwilling to add that as an answer, as it could be much more problematic. I am not sure whether your current `df` is a valid input for the `genome` class, though. — erasmortg, Jul 30 '15 at 18:33
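
For what it's worth, the subsampling idea described in the question can be sketched in base R without any population-genetics package. This is only a rough illustration under several assumptions, not a vetted method: every population is subsampled without replacement down to the smallest sample size, and differentiation is measured with Nei's G_ST computed from haplotype frequencies (one of several Fst analogues, here without small-sample correction), which may not be the estimator you actually need. The helper names `subsample_equal`, `pairwise_gst` and `gst_matrix` are made up for this example.

```r
# Example data from the question (pop = population, hap = mtDNA haplotype)
df <- data.frame(
  pop = rep(1:4, times = c(7, 6, 4, 5)),
  hap = c("A", "A", "B", "B", "B", "C", "D",
          "F", "F", "A", "A", "B", "E",
          "A", "A", "B", "D",
          "A", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Draw n individuals without replacement from each population.
# Assumption: n defaults to the smallest population sample size.
subsample_equal <- function(d, n = min(table(d$pop))) {
  rows_by_pop <- split(seq_len(nrow(d)), d$pop)
  idx <- unlist(lapply(rows_by_pop, function(i) i[sample.int(length(i), n)]))
  d[idx, , drop = FALSE]
}

# Nei's G_ST for two populations with equal weights: (H_T - H_S) / H_T,
# where H = 1 - sum(p^2) is the (uncorrected) haplotype diversity.
pairwise_gst <- function(h1, h2) {
  haps <- union(h1, h2)
  p1 <- as.numeric(table(factor(h1, levels = haps))) / length(h1)
  p2 <- as.numeric(table(factor(h2, levels = haps))) / length(h2)
  h_s <- mean(c(1 - sum(p1^2), 1 - sum(p2^2)))  # mean within-population diversity
  h_t <- 1 - sum(((p1 + p2) / 2)^2)             # diversity of the pooled frequencies
  if (h_t == 0) return(NA_real_)                # undefined when there is no variation
  (h_t - h_s) / h_t
}

# Symmetric matrix of pairwise G_ST values for all populations in d
gst_matrix <- function(d) {
  pops <- sort(unique(d$pop))
  m <- matrix(0, length(pops), length(pops),
              dimnames = list(as.character(pops), as.character(pops)))
  for (i in seq_along(pops)) {
    for (j in seq_along(pops)) {
      if (i > j) {
        g <- pairwise_gst(d$hap[d$pop == pops[i]], d$hap[d$pop == pops[j]])
        m[i, j] <- g
        m[j, i] <- g
      }
    }
  }
  m
}

# Repeat the subsampling and average the pairwise values over replicates
set.seed(1)
reps <- replicate(1000, gst_matrix(subsample_equal(df)), simplify = "array")
mean_gst <- apply(reps, c(1, 2), mean, na.rm = TRUE)
round(mean_gst, 3)
```

The result is a populations-by-populations table of mean pairwise values. If a dedicated package is preferred, each data frame returned by `subsample_equal(df)` has the same `pop`/`hap` layout as the input, so it could in principle be converted to whatever input format that package expects.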
