I want to calculate the pooled (actually weighted) standard deviation for all the unique sites in my data frame.
The values for these sites are values for single species forest stands and I want to pool the mean and the sd so that I can compare broadleaved stands with conifer stands.
This is the data frame (df) with values for the broadleaved stands:
keybl n mean sd
Vest02DenmDesp 3 58.16 6.16
Vest02DenmDesp 5 54.45 7.85
Vest02DenmDesp 3 51.34 1.71
Vest02DenmDesp 3 59.57 5.11
Vest02DenmDesp 5 62.89 10.26
Vest02DenmDesp 3 77.33 2.14
Mato10GermDesp 4 41.89 12.6
Mato10GermDesp 4 11.92 1.8
Wawa07ChinDesp 18 0.097 0.004
Chen12ChinDesp 3 41.93 1.12
Hans11SwedDesp 2 1406.2 679.46
Hans11SwedDesp 2 1156.2 464.07
Hans11SwedDesp 2 4945.3 364.58
Keybl is the code for the site. The formula for the pooled SD is:
s=sqrt((n1-1)*s1^2+(n2-1)*s2^2)/(n1+n2-2))
(Sorry I can't post pictures and did not find a link that would directly go to the formula)
Where 2 is the number of groups and therefore will change depending on site. I know this is used for t-test and two groups one wants to compare. In this case I'm not planning to compare these groups. My professor suggested me to use this formula to get a weighted sd. I didn't find a R function that incorporates this formula in the way I need it, therefore I tried to build my own. I am, however, new to R and not very good at making functions and loops, therefore I hope for your help.
This is what I got so far:
sd=function (data) {
nc1=data[z,"nc"]
sc1=data[z, "sc"]
nc2=data[z+1, "nc"]
sc2=data[z+1, "sc"]
sd1=(nc1-1)*sc1^2 + (nc2-1)*sc2^2
sd2=sd1/(nc1+nc2-length(nc1))
sqrt(sd2)
}
splitdf=split(df, with(df, df$keybl), drop = TRUE)
for (c in 1:length(splitdf)) {
for (i in 1:length(splitdf[[i]])) {
a = (splitdf[[i]])
b =sd(a)
}
}
1) The function itself is not correct as it gives slightly lower values than it should and I don't understand why. Could it be that it does not stop when z+1 has reached the last row? If so, how can that be corrected?
2) The loop is totally wrong but it is what I could come up with after several hours of no success.
Can anybody help me?
Thanks,
Antra