I'd like to add percentage labels per gear to the bars but keep counts on the y-axis, e.g. a label showing that 10% of all 'gear 3' cars are '4 cyl'.

library(ggplot2)

ds <- mtcars
ds$gear <- as.factor(ds$gear)

p1 <- ggplot(ds, aes(gear, fill=gear)) +
  geom_bar() +
  facet_grid(cols = vars(cyl), margins=T) 

p1

Ideally only with ggplot2, without adding dplyr or tidyr. I found some solutions along those lines, but they caused other issues with my original data.

EDIT: It has been suggested that this is a duplicate of another question.

I had seen that question earlier, but wasn't able to adapt its code to what I want:

# i just copy paste some of the code bits and try to reconstruct what I had earlier
ggplot(ds, aes(gear, fill=gear)) +
  facet_grid(cols = vars(cyl), margins=T) +       
  # ..prop.. meaning %, but i want to keep the y-axis as count
  geom_bar(aes(y = ..prop.., fill = factor(..x..)), stat="count") +  
  # not sure why, but I only get 100% 
  geom_text(aes( label = scales::percent(..prop..),
             y= ..prop.. ), stat= "count", vjust = -.5)

1 Answer

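The issue is that ggplot2 doesn't know that each facet should be treated as one group. By default every bar is its own group, so the proportion is computed within each bar and always comes out as 100%. This very useful tutorial offers a nice solution: just add aes(group = 1).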

P.S. In the beginning, I was often quite reluctant, even afraid, to manipulate my data and pre-calculate data frames for plotting. But there is no need to fret! It is often much easier (and safer!) to first shape or aggregate your data into the right form and then plot or analyse the new data.

library(tidyverse)
library(scales)

ds <- mtcars
ds$gear <- as.factor(ds$gear)

First solution:

ggplot(ds, aes(gear, fill = gear)) +
  geom_bar() +
  facet_grid(cols = vars(cyl), margins = T) +
  geom_text(aes(label = scales::percent(..prop..), group = 1), stat= "count")
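As a rough cross-check of what group = 1 produces here, the per-panel proportions can be reproduced in base R. This is only a sketch: the names counts and per_panel are made up for illustration, and the "(all)" margin panel is not covered. Within each cyl panel the gear shares add up to 100%, which also means that the same gear summed across panels can exceed 100%.

```r
# Cross-tabulate gear (rows) against cyl (columns) from the built-in mtcars data.
counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)

# margin = 2 normalises within each cyl column, i.e. within each facet panel.
per_panel <- prop.table(counts, margin = 2)

round(per_panel, 2)

# Each panel (column) adds up to 1, i.e. 100%.
colSums(per_panel)
```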

Edit, in reply to the comment:

Showing percentages across facets is quite confusing to the reader of the figure, and I would probably recommend against such a visualization. You won't get around data manipulation here. The challenge is to include your "facet margin": I create two summary data frames, one per cyl and one for all cylinders combined, and bind them together.

ds_count <- 
  ds %>% 
  count(cyl, gear) %>% 
  group_by(gear) %>% 
  mutate(perc = n/sum(n)) %>% 
  ungroup %>% 
  mutate(cyl = as.character(cyl))

ds_all <- 
  ds %>% 
  count(cyl, gear) %>% 
  group_by(gear) %>% 
  summarise(n = sum(n)) %>% 
  mutate(cyl = 'all', perc = 1)

ds_new <- bind_rows(ds_count, ds_all)

ggplot(ds_new, aes(gear, n, fill = gear)) +
  geom_col() +
  facet_grid(cols = vars(cyl)) +
  # the data are already aggregated, so each label sits at its bar height n
  geom_text(aes(label = scales::percent(perc)), vjust = -0.5)
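If you want to sanity-check the per-gear percentages without dplyr, a minimal base-R sketch could look like the following. The names counts and shares are only illustrative. Here the table is normalised within each gear, so the shares for one gear add up to 100% across the cyl panels.

```r
# Cross-tabulate gear (rows) against cyl (columns) from the built-in mtcars data.
counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)

# margin = 1 normalises within each gear row, i.e. "what share of this gear is in each cyl".
shares <- prop.table(counts, margin = 1)

round(shares, 2)

# Each gear's shares should add up to 1 (100%); this stops with an error otherwise.
stopifnot(all(abs(rowSums(shares) - 1) < 1e-12))
```

These values should agree with the perc column of ds_count, apart from gear/cyl combinations with zero cars, which count() drops but table() keeps as 0.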

IMO, a better way would be to simply swap the x and faceting variables. Then you can use ggplot2's built-in summarising stat as above.

ggplot(ds, aes(as.character(cyl), fill = gear)) +
  geom_bar() +
  facet_grid(cols = vars(gear), margins = T) +
  geom_text(aes(label = scales::percent(..prop..), group = 1), stat= "count")

Created on 2020-02-07 by the reprex package (v0.3.0)

tjebo
Nice and thanks, but unfortunately not yet what I am looking for. If you sum, e.g., the percentages of the red bars (9 + 29 + 86), you get more than 100%. Essentially, I'd like to display what percentage of each gear falls into each cyl, so that summing over the same gear gives 100%. – Philipp Staudacher Feb 07 '20 at 15:38