How to add 95% confidence intervals to graph of proportions of factor levels in ggplot?

Question

I wanted to build on the great answer I got to a previously asked question:

Graph proportion within a factor level rather than a count in ggplot2

I was hoping to build on the code:

var1 <- c("Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left","Left", "Right", NA, "Left", "Right", "Right", "Right", "Left", "Left", "Right", "Left", "Left","Left", "Right", "Left", "Right", "Right", "Right", "Left", "Left", "Right", NA, "Left", "Left")
var2 <- c("Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", NA, "Slightly lower","Higher", "Lower", NA, "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly higher", "Higher", "Higher", "Higher", "Slightly higher","Higher", "Lower", "Slightly higher", "Slightly higher", "Slightly higher", "Lower", "Slightly lower", "Higher", "Higher", "Higher", NA, "Slightly lower")
df <- as.data.frame(cbind(var1, var2))

library(dplyr)
library(ggplot2)

df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(n = n/sum(n)) %>%
  ungroup() %>%
  ggplot() + aes(var2, n, fill = var1) + 
  geom_bar(position = "dodge", stat = "identity") + 
  labs(x="Left or Right",y="Count")+
  scale_y_continuous() +
  scale_fill_discrete(name = "Answer:")+ theme_classic()+ 
  theme(legend.position="top")  +
  scale_fill_manual(values = c("black", "red"))

I would like to add error bars in the form of 95% confidence intervals to each bar on my graph. I have tried adding the terms

upperE=(1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n), lowerE=(-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n).

But alas I keep getting errors...

I also tried making an entirely new dataframe for the graph, thus:

dat <- df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)

test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
  geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = lowerE, ymax = upperE),position="dodge")+
  labs(x="Answer",y="Proportion")+
  scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
  theme(legend.position="top")

This gives me error bars, but they are centered at 0 on the y-axis rather than sitting on top of each bar...

Does anyone have any suggestions? Thank you!

Sarah · Accepted Answer · 2019-10-11T15:00:09.293

1

I have now worked out how to get the error bars to sit at the appropriate position on each bar - I needed to associate the ymin and ymax specification of the error bar with the values being plotted, thus:

dat <- df %>%
  na.omit() %>%
  group_by(var1, var2) %>%
  summarise(n = n()) %>%
  mutate(prop = n/sum(n),upperE=1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n, lowerE=-1.96*sqrt(n/sum(n))*(1-(n/sum(n)))/n)

test <- ggplot(dat, aes(x=var2, y = prop, fill = var1))+ 
  geom_bar(position = "dodge", stat = "identity") + geom_errorbar(aes(ymin = prop+lowerE, ymax = prop+upperE),width = .2, position=position_dodge(.9))+
  labs(x="Answer",y="Proportion")+
  scale_fill_discrete(name = "Condition:")+ theme_classic()+ 
  theme(legend.position="top")

Which gave a bar chart with an error bar sitting at the top of each bar (image not shown).

edited Oct 11 '19 at 15:00

answered Oct 11 '19 at 14:51

Sarah


Are you sure it's not `aes(ymin = prop + lowerE, ymax = prop + upperE)`? – Jonny Phelps Oct 11 '19 at 14:52
Hi! Thanks, both would be correct but your way reads much more intuitively - I will change my script. Thank you :) – Sarah Oct 11 '19 at 14:59
No worries Sarah :) – Jonny Phelps Oct 11 '19 at 15:01

score 0 · Answer 2 · edited Nov 30 '20 at 09:32

0

The formula for the SE used in the 95% CI of a proportion is: `se = sqrt(p * (1 - p) / n)`, where n is the full sample size. I think the solution above intends `sqrt(n/sum(n) * (1 - n/sum(n)) / n)`. However, n there is only the count of successes; the full sample is sum(n). So it should actually be `sqrt(n/sum(n) * (1 - n/sum(n)) / sum(n))`, with sum(n) rather than n as the denominator.
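
For reference, the normal-approximation (Wald) interval described here can be written out in symbols. The notation is introduced only for this note and is not from the original post: $n$ is the count in one cell, $N = \sum n$ is the total for the group, and $\hat{p}$ is the plotted proportion.

```latex
\hat{p} = \frac{n}{N}, \qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1 - \hat{p})}{N}}, \qquad
\text{95\% CI} = \hat{p} \pm 1.96 \cdot \mathrm{SE}(\hat{p})
```

In the dplyr pipeline from the question, the data are still grouped by var1 after `summarise()`, so `sum(n)` inside `mutate()` plays the role of $N$ for each var1 group.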

edited Nov 30 '20 at 09:32

Timus


answered Nov 30 '20 at 08:31

4352sdf


score 0 · Answer 3 · answered Nov 15 '22 at 22:51

Super old thread, but just in case somebody still stumbles upon this: the formula for the confidence intervals in the upvoted answer is incorrect.

It should be:

mutate(prop = n/sum(n),
         upperE=1.96*sqrt(n/sum(n)*(1-(n/sum(n)))/sum(n)), 
         lowerE=-1.96*sqrt(n/sum(n)*(1-(n/sum(n)))/sum(n)))

With the formula that you used for the confidence intervals, you only take the square root of the first part of the formula. However, you need to take the square root of the entire expression (everything except the z-score).
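
As a rough sketch of the same calculation outside the pipeline, the margin can be wrapped in a small base R helper. The name `wald_margin` and the counts below are made up for illustration (they are not taken from the question's data), and clamping the limits to [0, 1] is an optional choice, not something the thread prescribes.

```r
# Half-width of a normal-approximation (Wald) interval for a proportion.
# `wald_margin` is a hypothetical helper name, not part of dplyr or ggplot2.
wald_margin <- function(count, total, z = 1.96) {
  p <- count / total
  # The square root covers the whole p * (1 - p) / total term
  z * sqrt(p * (1 - p) / total)
}

# Illustrative counts for one group (made-up numbers)
counts <- c(Higher = 8, Lower = 2, `Slightly higher` = 10)
total  <- sum(counts)

ci <- data.frame(
  answer = names(counts),
  prop   = as.numeric(counts / total)
)
margin <- unname(wald_margin(counts, total))

# Optional: clamp the limits so they stay within valid proportions
ci$lower <- pmax(0, ci$prop - margin)
ci$upper <- pmin(1, ci$prop + margin)
print(ci)
```

Inside the pipeline, the same helper could be used as `mutate(prop = n / sum(n), margin = wald_margin(n, sum(n)))`, with the error bars then drawn via `aes(ymin = prop - margin, ymax = prop + margin)`.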
