New to R. I'm trying to calculate the rate at which each batter hits into a double play, using a data set covering 2006-2016. My code is flawed and I'm not sure why: Rate1 comes out the same for every batter. Once I have Rate1 for each batter I want the overall mean and standard deviation, but I haven't gotten that far yet...
This is a subset of the data frame...
BAT_ID DP_FL
2 hanim001 FALSE
18 hereg002 FALSE
40 pujoa001 TRUE
50 espid001 TRUE
97 troum001 FALSE
131 calhk001 FALSE
136 hanim001 FALSE
148 hanim001 FALSE
165 mottt001 FALSE
215 calhk001 TRUE
238 calhk001 FALSE
255 napom001 FALSE
264 gomec002 FALSE
267 maybc001 TRUE
271 napom001 FALSE
279 rua-r001 FALSE
283 simma001 TRUE
286 mazan001 FALSE
318 martj007 FALSE
322 choos001 TRUE
356 gomec002 FALSE
#Percent groundball double play
library(plyr)
mean1 <- ddply(all_data_gnd, .(BAT_ID), summarize,
  Rate1 = (sum(as.numeric(which(all_data_gnd$DP_FL == 1))) /
    (sum(as.numeric(which(all_data_gnd$DP_FL == 0))) +
     sum(as.numeric(which(all_data_gnd$DP_FL == 1))))))
head(mean1)
> head(mean1)
BAT_ID Rate1
1 abrej003 0.1741862
2 adamc001 0.1741862
3 adaml001 0.1741862
4 adamm002 0.1741862
5 adduj002 0.1741862
6 adlet001 0.1741862
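Two things in the call above probably explain the identical values. First, `all_data_gnd$DP_FL` inside `summarize` refers to the whole data frame rather than the current batter's rows, so every group computes the same number. Second, `which()` returns row positions, so `sum(which(...))` adds up indices instead of counting plays. Below is a minimal sketch of the per-batter rate in base R, assuming `DP_FL` is a logical column. The name `plays` is a stand-in built from the subset shown above, and `rate_by_batter`, `overall_mean`, and `overall_sd` are illustrative names, not part of the original code.

```r
# Stand-in for all_data_gnd, built from the subset printed in the question.
plays <- data.frame(
  BAT_ID = c("hanim001", "hereg002", "pujoa001", "espid001", "troum001",
             "calhk001", "hanim001", "hanim001", "mottt001", "calhk001",
             "calhk001", "napom001", "gomec002", "maybc001", "napom001",
             "rua-r001", "simma001", "mazan001", "martj007", "choos001",
             "gomec002"),
  DP_FL = c(FALSE, FALSE, TRUE, TRUE, FALSE,
            FALSE, FALSE, FALSE, FALSE, TRUE,
            FALSE, FALSE, FALSE, TRUE, FALSE,
            FALSE, TRUE, FALSE, FALSE, TRUE,
            FALSE),
  stringsAsFactors = FALSE
)

# The mean of a logical vector is the proportion of TRUE values,
# so this gives double plays / total plays within each batter's rows.
rate_by_batter <- aggregate(DP_FL ~ BAT_ID, data = plays, FUN = mean)
names(rate_by_batter)[2] <- "Rate1"

# Overall mean and standard deviation of the per-batter rates.
overall_mean <- mean(rate_by_batter$Rate1)
overall_sd <- sd(rate_by_batter$Rate1)

print(head(rate_by_batter))
```

With plyr, the equivalent would be `ddply(all_data_gnd, .(BAT_ID), summarize, Rate1 = mean(DP_FL))`, where the bare `DP_FL` refers to the current group's column. Note that the overall mean here weights every batter equally regardless of how many plays each has, which may or may not be what is wanted.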