ggplot2 - stacking graphs with different response variables but same x variable

Question

I have a data frame made up of 3 continuous response variables and 2 categorical predictor variables. I have been modelling each response variable separately, but using the same predictor variables. I would like to make 3 bar charts with the same x axis, one for each response variable. It would be nice to get the layout of something like facet_wrap, since each graph then wouldn't need its own x axis. I've attached some sample data, and some code to show one of the graphs I produced.

y1<-sample(1:150, 100, replace=T)
y2<-sample(1:150, 100, replace=T)
y3<-sample(1:150, 100, replace=T)
x1<-sample(x=c("Site1", "Site2"), size=100, replace=T, prob=rep(1/2,2))
x2<-sample(x=c("A", "B", "C", "D"), size=100, replace=T, prob=rep(1/4,4))

df<-data.frame(y1,y2,y3,x1,x2)

library(ggplot2)

# Base plot object only: no geom is added here, so this draws empty axes
ggplot(df, aes(x=x2, y=y1, fill=x1))

y1sum<-summarySE(df, measurevar="y1", groupvars=c("x1", "x2"))


ggplot(y1sum, aes(x=x2, y=y1, fill=x1)) +
  geom_bar(position=position_dodge(), stat="identity") +
  geom_errorbar(aes(ymin=y1-ci, ymax=y1+ci), width=.2,
                position=position_dodge(.9))

So I'd like to get the above graph, but for each response variable and stacked on top of each other.

As an aside, I'd also appreciate some guidance on how to add some letters above each set of bars to show which are significantly different.

The summarySE function is based on the code from http://www.cookbook-r.com/Graphs/Plotting_means_and_error_bars_(ggplot2)/

summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    library(plyr)

    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else       length(x)
    }

    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
      .fun = function(xx, col) {
        c(N    = length2(xx[[col]], na.rm=na.rm),
          mean = mean   (xx[[col]], na.rm=na.rm),
          sd   = sd     (xx[[col]], na.rm=na.rm)
        )
      },
      measurevar
    )

    # Rename the "mean" column    
    datac <- rename(datac, c("mean" = measurevar))

    datac$se <- datac$sd / sqrt(datac$N)  # Calculate standard error of the mean

    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval: 
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult

    return(datac)
}

Thanks in advance to anyone who can offer advice.

score 1 · Accepted Answer · answered Sep 14 '16 at 08:55

I have used dplyr (plus gather from tidyr to reshape the data to long format) instead of the summarySE function you used.

library(dplyr)
library(tidyr)    # provides gather()
library(ggplot2)
test <- df %>% gather(., key="var", value="value", -x1, -x2) %>%
  group_by(x1,x2,var) %>% summarise(N=n(), 
                                    Mean = mean(value), 
                                    sd= sd(value),
                                    se = sd/sqrt(N),
                                    ci = qnorm(0.975)*se) %>% ungroup
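
Note that the summary above computes ci with a normal quantile, qnorm(0.975), whereas the summarySE function in the question uses a t quantile with N-1 degrees of freedom. If you want intervals that match summarySE, a minimal sketch could look like the following. The name test_t, the fixed seed, and the .groups argument are my own additions for illustration and are not part of the original code.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Sample data built the same way as in the question
df <- data.frame(
  y1 = sample(1:150, 100, replace = TRUE),
  y2 = sample(1:150, 100, replace = TRUE),
  y3 = sample(1:150, 100, replace = TRUE),
  x1 = sample(c("Site1", "Site2"), 100, replace = TRUE),
  x2 = sample(c("A", "B", "C", "D"), 100, replace = TRUE)
)

# Same summary as above, but the CI multiplier is a t quantile with
# N - 1 degrees of freedom, as in summarySE
test_t <- df %>%
  gather(key = "var", value = "value", -x1, -x2) %>%
  group_by(x1, x2, var) %>%
  summarise(
    N    = n(),
    Mean = mean(value),
    sd   = sd(value),
    se   = sd / sqrt(N),
    ci   = qt(0.975, df = N - 1) * se,
    .groups = "drop"
  )
```

With small groups the t-based intervals are somewhat wider than the normal-based ones; for large N the two are nearly identical.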

The code below creates a single column of bar plots, coloured by site and faceted by variable.

test %>%  ggplot(., aes(x=x2, y=Mean, fill=x1)) +
  geom_bar(position=position_dodge(), stat="identity") + 
  geom_errorbar(aes(ymin=Mean-ci, ymax=Mean+ci), width=.2,position=position_dodge(.9)) +
  facet_wrap(~var, ncol = 1)
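
By default facet_wrap gives every panel the same y axis. If your real response variables are on different scales, you may want each panel to get its own y axis via scales = "free_y". Here is a rough sketch of that. The wider range for y3, the fixed seed, and the .groups argument are assumptions made purely for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Sample data as in the question, except y3 is drawn from a wider range
# (an assumption, used only so that the panels differ in scale)
df <- data.frame(
  y1 = sample(1:150, 100, replace = TRUE),
  y2 = sample(1:150, 100, replace = TRUE),
  y3 = sample(1:1500, 100, replace = TRUE),
  x1 = sample(c("Site1", "Site2"), 100, replace = TRUE),
  x2 = sample(c("A", "B", "C", "D"), 100, replace = TRUE)
)

# Long format, then one summary row per site / category / response
test <- df %>%
  gather(key = "var", value = "value", -x1, -x2) %>%
  group_by(x1, x2, var) %>%
  summarise(N = n(), Mean = mean(value), sd = sd(value),
            se = sd / sqrt(N), ci = qnorm(0.975) * se,
            .groups = "drop")

# scales = "free_y" lets each facet choose its own y range;
# the x axis is still shared across the stacked panels
p <- ggplot(test, aes(x = x2, y = Mean, fill = x1)) +
  geom_bar(position = position_dodge(), stat = "identity") +
  geom_errorbar(aes(ymin = Mean - ci, ymax = Mean + ci), width = .2,
                position = position_dodge(.9)) +
  facet_wrap(~var, ncol = 1, scales = "free_y")
p
```

Keep the default fixed scales if you want the bar heights to be directly comparable across panels.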

It may be worth considering a box plot instead, as box plots often convey more information about the dataset than bar plots.

df %>% gather(., key="var", value="value", -x1, -x2) %>% 
  ggplot(., aes(x=x2, y=value, fill=x1)) +geom_boxplot() +
  facet_wrap(~var, ncol = 1)
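
On the aside about letters above each set of bars: one way is to add a label column to the summary table and draw it with geom_text, dodged the same way as the bars. This is only a sketch. The label column and the letters in it are arbitrary placeholders, not the result of any test; in practice they would come from a post-hoc comparison of your fitted models (for example a compact letter display such as multcomp::cld(), if that suits your models).

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Sample data built the same way as in the question
df <- data.frame(
  y1 = sample(1:150, 100, replace = TRUE),
  y2 = sample(1:150, 100, replace = TRUE),
  y3 = sample(1:150, 100, replace = TRUE),
  x1 = sample(c("Site1", "Site2"), 100, replace = TRUE),
  x2 = sample(c("A", "B", "C", "D"), 100, replace = TRUE)
)

test <- df %>%
  gather(key = "var", value = "value", -x1, -x2) %>%
  group_by(x1, x2, var) %>%
  summarise(N = n(), Mean = mean(value), sd = sd(value),
            se = sd / sqrt(N), ci = qnorm(0.975) * se,
            .groups = "drop")

# PLACEHOLDER letters only: replace with the letters from your own
# post-hoc test; these carry no statistical meaning
test_lab <- test %>%
  mutate(label = ifelse(x1 == "Site1", "a", "b"))

p <- ggplot(test_lab, aes(x = x2, y = Mean, fill = x1)) +
  geom_bar(position = position_dodge(.9), stat = "identity") +
  geom_errorbar(aes(ymin = Mean - ci, ymax = Mean + ci), width = .2,
                position = position_dodge(.9)) +
  # Place each letter just above the top of its error bar
  geom_text(aes(y = Mean + ci, label = label, group = x1),
            position = position_dodge(.9), vjust = -0.5) +
  # Extra headroom so the letters are not clipped at the top of a panel
  scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
  facet_wrap(~var, ncol = 1)
p
```

Setting group = x1 in geom_text keeps the letters aligned with the dodged bars even when several bars share the same letter.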

Cheers that worked perfectly. And thanks for the code for boxplots as well — Conor Neilson, Sep 14 '16 at 09:02
