3

Let's say I'm working with proportions, I have two main variables (sex and pain_level). It's not difficult to plot them:

Plot 1 - proportions

With tidyverse and broom (and thanks for this link here: Calling prop.test function in R with dplyr) I can compare if the proportions are statistically different.

Proportion test

Now comes the question!

I want to add to the plot, the error bar. I know it's not as difficult as I'm thinking, but I could not find a way to do it. I've tried to replicate this link here (http://www.andrew.cmu.edu/user/achoulde/94842/labs/lab07_solution.html) but I'm trying to stay at tidyverse environment.

The desired output should be something like that: Desired output

Please feel free to use the script/syntax below that simulate the original dataset.

library(tidyverse)
ds <- data.frame(sex = rep(c("M","F"), 18),
                 pain_level = c("High","Moderate","low"))

#plot
ds %>% 
  group_by(pain_level, sex) %>% 
  summarise(n=n()) %>% 
  mutate(prop = n/sum(n)*100) %>% 
  ggplot(., aes(x = sex, fill = pain_level, y = prop)) +
  geom_bar(stat = "summary") +
  facet_wrap( ~ pain_level) +
  theme(legend.position = "none")

#p values of proportion test

ds %>% 
  rowwise %>%
  group_by(pain_level, sex) %>% 
  summarise(cases = n()) %>% 
  mutate(pop = sum(cases)) %>% #compute totals
  distinct(., pain_level, .keep_all= TRUE) %>% #keep only one value of the row 
  mutate(tst = list(broom::tidy(prop.test(cases, pop, conf.level=0.95)))) %>%
  tidyr::unnest(tst)
Luis
  • 1,388
  • 10
  • 30
  • I can't get your code to run without errors. `tipo` is a variable that is missing. Generally, `geom_errorbar()` would do the trick with `ymin = conf.low` and `ymax = conf.high` from the `prop.test`s. – teunbrand Aug 03 '19 at 23:31
  • Sorry, my bad. The variable name is pain_level (instead of tipo). Could you please try to emulate it now ? I've tried ymin and max but did not have success. Thanks – Luis Aug 04 '19 at 13:36

1 Answers1

4

I think the following might roughly resemble your desired output:

ds %>% 
  group_by(pain_level, sex) %>% 
  summarise(cases = n()) %>% 
  mutate(pop = sum(cases)) %>%
  rowwise() %>%
  mutate(tst = list(broom::tidy(prop.test(cases, pop, conf.level=0.95)))) %>%
  tidyr::unnest(tst) %>%
  ggplot(aes(sex, estimate, group = pain_level)) +
  geom_col(aes(fill = pain_level)) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high)) +
  facet_wrap(~ pain_level)

enter image description here

teunbrand
  • 33,645
  • 4
  • 37
  • 63
  • Wow! Amazing. I didn't know it was simple as that. I always see the "conf.low" and "conf.high" as results that we need to compute before plotting. Good to know it's automatically computed by ggplot. – Luis Aug 05 '19 at 13:24
  • 1
    They are not computed by ggplot but by the `prop.test()` function that is called just before handing the data to ggplot. – teunbrand Aug 05 '19 at 14:44
  • ahhh, okay! Now it is clear to me! Thanks again, I upvoted and accepted your answer. – Luis Aug 05 '19 at 15:55