I have a data frame with 3 continuous response variables and 2 categorical predictor variables. I have been modelling each response variable separately, using the same predictor variables each time. I would like to make 3 bar charts, one per response variable, that share the same x-axis. Ideally the layout would look like the output of facet_wrap, so that each graph wouldn't need its own x-axis. I've included some sample data below, along with the code for one of the graphs I produced.
# Sample data: three responses and two categorical predictors
y1 <- sample(1:150, 100, replace = TRUE)
y2 <- sample(1:150, 100, replace = TRUE)
y3 <- sample(1:150, 100, replace = TRUE)
x1 <- sample(x = c("Site1", "Site2"), size = 100, replace = TRUE, prob = rep(1/2, 2))
x2 <- sample(x = c("A", "B", "C", "D"), size = 100, replace = TRUE, prob = rep(1/4, 4))
df <- data.frame(y1, y2, y3, x1, x2)
library(ggplot2)
# summarySE() is defined at the end of this post
y1sum <- summarySE(df, measurevar = "y1", groupvars = c("x1", "x2"))
# Dodged bars of the group means with 95% confidence interval error bars
ggplot(y1sum, aes(x = x2, y = y1, fill = x1)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = y1 - ci, ymax = y1 + ci), width = .2,
                position = position_dodge(.9))
So I'd like to produce the graph above for each response variable, with the three panels stacked on top of each other.
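One possible way to get that layout, as a rough sketch rather than a definitive solution: reshape the data to long format so the response variable becomes a column, summarise per group, and facet on that column. This assumes the dplyr and tidyr packages, which are not used elsewhere in this post, and it recomputes the mean and 95% CI inline instead of calling summarySE. The set.seed call and the names long, sums, avg, response and value are additions for illustration only.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
df <- data.frame(
  y1 = sample(1:150, 100, replace = TRUE),
  y2 = sample(1:150, 100, replace = TRUE),
  y3 = sample(1:150, 100, replace = TRUE),
  x1 = sample(c("Site1", "Site2"), 100, replace = TRUE),
  x2 = sample(c("A", "B", "C", "D"), 100, replace = TRUE)
)

# Long format: one row per observation per response variable
long <- pivot_longer(df, cols = c(y1, y2, y3),
                     names_to = "response", values_to = "value")

# Mean and 95% CI half-width for each response / x1 / x2 combination
sums <- long %>%
  group_by(response, x1, x2) %>%
  summarise(N   = n(),
            avg = mean(value),
            se  = sd(value) / sqrt(N),
            ci  = se * qt(0.975, N - 1),
            .groups = "drop")

# One panel per response, stacked in a single column with a shared x-axis
p <- ggplot(sums, aes(x = x2, y = avg, fill = x1)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = avg - ci, ymax = avg + ci),
                width = 0.2, position = position_dodge(0.9)) +
  facet_wrap(~ response, ncol = 1)
print(p)
```

If the responses are on very different scales, passing scales = "free_y" to facet_wrap lets each panel have its own y-axis while keeping the shared x-axis.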
As an aside, I'd also appreciate some guidance on how to add letters above each set of bars to show which groups are significantly different.
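For the placement part only, a minimal sketch might use geom_text with the same dodge width as the bars. The summary table below is a hypothetical stand-in for the output of summarySE, and the means, CIs and the sig_letter column are made-up placeholders, not results of any test; in practice the letters would have to come from a post-hoc comparison of the fitted model.

```r
library(ggplot2)

# Hypothetical summary table; every value here is a placeholder
y1sum <- data.frame(
  x1 = rep(c("Site1", "Site2"), each = 4),
  x2 = rep(c("A", "B", "C", "D"), times = 2),
  y1 = c(70, 82, 65, 90, 75, 60, 88, 72),
  ci = c(12, 15, 10, 14, 11, 13, 16, 12),
  sig_letter = c("a", "ab", "a", "b", "ab", "a", "b", "ab")  # placeholder letters
)

p <- ggplot(y1sum, aes(x = x2, y = y1, fill = x1)) +
  geom_col(position = position_dodge(0.9)) +
  geom_errorbar(aes(ymin = y1 - ci, ymax = y1 + ci),
                width = 0.2, position = position_dodge(0.9)) +
  # Place each letter just above the top of its error bar, dodged like the bars
  geom_text(aes(y = y1 + ci, label = sig_letter, group = x1),
            position = position_dodge(0.9), vjust = -0.5)
print(p)
```

The explicit group = x1 keeps the text dodged in step with the bars, and vjust = -0.5 lifts the label slightly above the error bar.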
The summarySE function is based on the code from here: http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    library(plyr)

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
      .fun = function(xx, col) {
        c(N    = length2(xx[[col]], na.rm=na.rm),
          mean = mean   (xx[[col]], na.rm=na.rm),
          sd   = sd     (xx[[col]], na.rm=na.rm)
        )
      },
      measurevar
    )

    # Rename the "mean" column
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval:
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}
Thanks in advance to anyone who can offer advice.