I am trying to normalize a stacked plot and put the percentage of each fill in the middle of each fill. This is an easy question, but I don't have a handle on what the y-value is, since it is implicit, so I am not sure how to set: geom_text(aes(y=??, label=paste0(??,"%"))).

As for normalizing, I have seen people collapse and normalize dt beforehand, but I was hoping to learn a ggplot-ish way to do it.

Do I just need to convert my dt to a summary table of time-grpmember percentages? That would give me a direct handle on y-values (e.g. to compute 'middle of each fill').

library(data.table)
library(ggplot2)

dt <- as.data.table(read.table(text = "time grpmember
1 TRUE
1 TRUE
1 TRUE
1 FALSE
1 FALSE
1 FALSE
2 FALSE
2 TRUE
2 TRUE
2 TRUE", header=TRUE))
dt$time <- as.factor(dt$time)

ggplot(dt, aes(x=time)) + geom_bar(aes(fill=grpmember)) + geom_text(aes(y = 2, label=paste0("?", "%")))

Graph

rjturn
  • Could you provide the original data of your counts ? If I understand correctly, you would like the y-axis expressed in original count values while also having percentages for every stack displayed? – Ruthger Righart Apr 15 '15 at 13:33
  • If you want a 100% stacked bar chart, where each of the two series' percentages sum to 100%, ggplot2 can do that w/o needing to transform the data: http://stackoverflow.com/questions/3619067/stacked-bar-chart-in-r-ggplot2-with-y-axis-and-bars-as-percentage-of-counts But as far as percentage labels with such a chart, you might need to convert to a summary table as you propose and calculate the positions of the labels that way. – Sam Firke Apr 15 '15 at 13:56

1 Answer

Edit

From about ggplot2 2.2.0, position_stack() and position_fill() accept a vjust argument that geom_text can use, so there is no longer any need to calculate or use a y aesthetic to position the labels (see pos in the original below).

dt <- read.table(text = "time grpmember
1 TRUE
1 TRUE
1 TRUE
1 FALSE
1 FALSE
1 FALSE
2 FALSE
2 TRUE
2 TRUE
2 TRUE", header=TRUE)
dt$time <- as.factor(dt$time)

library(ggplot2)
library(dplyr)

dtSummary = dt %>%
   group_by(time, grpmember) %>%
   summarise(count = n()) %>%
   mutate(Percent = paste0(sprintf("%.1f", count / sum(count) * 100), "%")) 

ggplot(dtSummary, aes(x = time, y = count, fill = grpmember, label = Percent)) + 
   geom_bar(position = "stack", stat = "identity") + 
   geom_text(position = position_stack(vjust = .5))
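
The plot above stacks raw counts. For the normalized version the question asks about, a minimal sketch is to swap position_stack for position_fill in both layers, assuming a ggplot2 version where position_fill() takes vjust. The percent-formatted y-axis via scales::percent is my addition, not part of the original answer.

```r
library(ggplot2)
library(dplyr)

# Same data as in the question, built directly as a data frame
dt <- data.frame(
  time = factor(c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)),
  grpmember = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

# Counts per time/grpmember, then percentages within each time
dtSummary <- dt %>%
  group_by(time, grpmember) %>%
  summarise(count = n()) %>%
  mutate(Percent = sprintf("%.1f%%", count / sum(count) * 100)) %>%
  ungroup()

# position = "fill" rescales each bar to height 1;
# position_fill(vjust = 0.5) centres each label in its segment
p <- ggplot(dtSummary, aes(x = time, y = count, fill = grpmember, label = Percent)) +
  geom_col(position = "fill") +
  geom_text(position = position_fill(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent)

print(p)
```

Because both layers use the same fill position, the labels stay centred without any manually computed y value.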



Original

I'm not exactly sure what you are after, but this calculates a summary table containing counts, labels (i.e., percents) and positions for the labels (at the midpoint of each segment); then draws the plot.

dt <- read.table(text = "time grpmember
1 TRUE
1 TRUE
1 TRUE
1 FALSE
1 FALSE
1 FALSE
2 FALSE
2 TRUE
2 TRUE
2 TRUE", header=TRUE)
dt$time <- as.factor(dt$time)

library(ggplot2)
library(dplyr)

dtSummary = dt %>%
   # Get the counts
   group_by(time, grpmember) %>%
   summarise(count = n()) %>%
   # Get labels and position of labels
   group_by(time) %>%
   mutate(Percent = paste0(sprintf("%.1f", count / sum(count) * 100), "%")) %>%
   mutate(pos = cumsum(count) - 0.5 * count) %>%
   # Reverse the fill levels so the stacking order matches pos
   mutate(grpmember = factor(grpmember),
          grpmember = factor(grpmember, levels = rev(levels(grpmember))))

ggplot(dtSummary, aes(x = time, y = count, fill = grpmember)) + 
   geom_bar(stat = "identity") + 
   geom_text(aes(y = pos, label = Percent))
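
To see why pos lands in the middle of each segment, here is the same arithmetic in base R for the two bars in this data. The helper name segment_midpoints is hypothetical and only used for illustration.

```r
# Midpoint of each stacked segment: top edge (cumulative sum) minus half the height
segment_midpoints <- function(count) {
  cumsum(count) - 0.5 * count
}

# time 1: FALSE = 3, TRUE = 3 -> segments span 0-3 and 3-6
time1 <- segment_midpoints(c(3, 3))
print(time1)  # 1.5 4.5

# time 2: FALSE = 1, TRUE = 3 -> segments span 0-1 and 1-4
time2 <- segment_midpoints(c(1, 3))
print(time2)  # 0.5 2.5
```

These positions assume the segments are stacked from the bottom in the same order as the rows, which is why the answer reorders the grpmember levels.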


Sandy Muspratt