I have a dataframe that looks like this:
df <- data.frame("Experiment" = c(rep("Exp1", 6), rep("Exp2", 5), rep("Exp3", 4)),
"Replicate" = c("A","A","A","B","C","C","A","A","B","B","C","A","B","B","C"),
"Type" = c("alpha","beta","gamma","alpha","alpha","beta","alpha","gamma","beta","gamma","beta","alpha","alpha","gamma","beta"),
"Frequency" = c(10,100,1000,15,5,105,10,1010,95,1020,105,15,10,990,100))
I'm trying to calculate mean and stdev of Frequency
for combination of Experiment
and Type
, and I first tried it by running this line:
df %>% group_by(Experiment, Type) %>% summarise(mean = mean(Frequency), sd = sd(Frequency)
If I run this, I get a tibble that looks like below:
Experiment Type mean sd
Exp1 alpha 10 5
Exp1 beta 102. 3.54
Epx1 gamma 1000 NA
But I'd like R to think that all Type
(alpha
, beta
, gamma
) should exist for every combination of Experiment
and Replicate
, so that if there is no Frequency
value for Type
, R will use 0
instead of not including that value.
In other words, what I want needs to be calculated like below:
Experiment Type mean sd
Exp1 alpha mean(10,15,5) sd(10,15,5)
Exp1 beta mean(100,0,105) sd(100,0,105)
Exp1 gamma mean(1000,0,0) sd(1000,0,0)
For example, for Exp1
beta
, the summarise
function I used above calculates mean(100,105)
and sd(100,105)
because Exp1
Replicate B
doesn't exist in my df
. But I want R to calculate mean(100,0,105)
and sd(100,0,105)
instead. Would anyone be able to give me some ideas on how to do this?