
I have a data frame of percentage shares: the columns represent different items, and the rows give the share of interviewees answering in each category. I want to produce a stacked bar chart.

library(ggplot2)
library(reshape2)
 test<-data.frame(i1=c(16,40,26,18),
               i2=c(17,46,27,10),
               i3=c(23,43,24,10),
               i4=c(19,25,20,36))
 rownames(test)<-c("very i.","i.","less i.","not i.")

test.m<-melt(test)

ggplot(test.m, aes(x=variable, y=value, fill=value)) + 
   geom_bar(position="stack", stat="identity")

This looks OK, but I want to:
a) center the bars, with the positive answers (very i. and i.) going up and the bottom two classes (less i. and not i.) going down;
b) give each category (very i., i., less i., not i.) the same colour in every bar.

Any help would be much appreciated.

Fritzbrause

2 Answers


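It is better to store the category names in a separate column instead of using row names, because melt() drops row names:

test$category <- factor(c("very i.","i.","less i.","not i."), levels=c("not i.","less i.","i.","very i."))

(The factor levels are ordered with respect to the stacked barplot: lowest "not i.", highest "very i.". The values are given in row order, so each row keeps its original label.)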

test.m <- melt(test, id.vars = "category")
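To see why the category column is needed, it can help to look at the long-format data. The following is a minimal sketch, assuming the `test` data from the question. It builds the category column from plain labels in row order (a choice made here so that the snippet is self-contained) and names the id column explicitly.

```r
library(reshape2)

# Data from the question: one column per item, one row per answer category
test <- data.frame(i1 = c(16, 40, 26, 18),
                   i2 = c(17, 46, 27, 10),
                   i3 = c(23, 43, 24, 10),
                   i4 = c(19, 25, 20, 36))

# Category labels in row order (assumed here); levels run from lowest to highest
test$category <- factor(c("very i.", "i.", "less i.", "not i."),
                        levels = c("not i.", "less i.", "i.", "very i."))

# Long format: one row per (category, item) pair
test.m <- melt(test, id.vars = "category")

str(test.m)      # columns: category, variable, value
head(test.m, 4)  # the four categories for item i1
```

Each row of `test.m` now carries its own category label, which is what `fill=category` relies on.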

To answer your questions:

  1. Stacked barplots do not work well if some values are above and others are below zero. Hence, two separate barplots are created (one with negative values, one with positive values).
  2. The new column category is used for the fill parameter to map each category to a different colour.

The complete code:

ggplot(test.m, aes(x=variable, fill=category)) + 
      geom_bar(data = subset(test.m, category %in% c("less i.","not i.")),
               aes(y = -value), position="stack", stat="identity") +
      geom_bar(data = subset(test.m, !category %in% c("less i.","not i.")), 
               aes(y = value), position="stack", stat="identity")
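In more recent ggplot2 releases (2.2.0 and later, to my knowledge), `position_stack()` stacks positive and negative values separately, so the same kind of chart can also be sketched with a single layer and a signed value column. This is an alternative sketch rather than the code above. The column name `signed` and the use of `geom_col()` are choices made here, and the data frame is rebuilt by hand so that the snippet runs on its own.

```r
library(ggplot2)

# Long-format data built directly (same numbers as in the question)
items <- c("i1", "i2", "i3", "i4")
cats  <- c("very i.", "i.", "less i.", "not i.")
test.m <- data.frame(
  variable = rep(items, each = 4),
  category = factor(rep(cats, times = 4), levels = cats),
  value    = c(16, 40, 26, 18,  17, 46, 27, 10,
               23, 43, 24, 10,  19, 25, 20, 36)
)

# Hypothetical helper column: negative sign for the two lower categories
test.m$signed <- ifelse(test.m$category %in% c("less i.", "not i."),
                        -test.m$value, test.m$value)

# One layer; positive values stack upwards, negative values downwards
p <- ggplot(test.m, aes(x = variable, y = signed, fill = category)) +
  geom_col(position = "stack") +
  geom_hline(yintercept = 0)

print(p)
```

The order of the segments within each half depends on the factor levels and on the ggplot2 version, so it is worth checking the result against the legend. `position_stack(reverse = TRUE)` can flip the order if needed.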


Sven Hohenstein
  • @sven-hohenstein I can't seem to get this to behave for me. Please see [my example](https://gist.github.com/geotheory/21f65359458743bb2787) which gets the +ve/-ve y categories mixed up! Any idea why? – geotheory Jan 19 '15 at 15:42
  • @geotheory You should reverse the vector of color values (`scale_fill_manual(values=rev(colorRampPalette(c('red','grey','green'))(4)))`). – Sven Hohenstein Jan 19 '15 at 16:20
  • I'm not sure mine are still mixed up, just with the colour scheme reversed: http://gyazo.com/299436e3f68a77b431812dca8e175eff – geotheory Jan 19 '15 at 16:27
  • @geotheory Do you want to change the order of the factor levels? `levels(d$rating) <- c('bad','very_bad','good','very_good')` – Sven Hohenstein Jan 19 '15 at 16:32
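A note on the last suggestion: assigning to `levels()` renames the existing levels in place, so it changes which label each observation carries, whereas `factor(x, levels = ...)` changes only the level order. A small sketch with made-up ratings (the vector `rating` is hypothetical and is not the data from the gist):

```r
rating <- factor(c("very_bad", "bad", "good", "very_good"),
                 levels = c("very_bad", "bad", "good", "very_good"))

# Assigning to levels() renames level 1, level 2, ... in place
relabelled <- rating
levels(relabelled) <- c("bad", "very_bad", "good", "very_good")

# factor(..., levels = ) keeps each value and only changes the level order
reordered <- factor(rating, levels = c("bad", "very_bad", "good", "very_good"))

as.character(relabelled)  # the first two values are swapped
as.character(reordered)   # the values are unchanged
levels(reordered)         # only the order of the levels differs
```

If the aim is only to change the stacking or legend order, the second form is the safer one.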

Another tool that is designed exactly for this purpose is likert() in the HH package. This sweet function plots diverging stacked barcharts appropriate for Likert, semantic differential, and rating scale data.

library(HH)
# note use of t(test)[,4:1] to transpose and mirror dataframe for easy plotting
# test dataframe is otherwise unaltered from OP's question

likert(t(test)[,4:1], horizontal = FALSE,
       main = NULL, # or give "title",
       xlab = "Percent", # becomes ylab due to horizontal arg
       auto.key = list(space = "right", columns = 1,
                     reverse = TRUE))
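To see what likert() receives here, the transposed and mirrored matrix can be inspected on its own. A minimal sketch using only base R and the `test` data frame from the question:

```r
# Data and row names as in the question
test <- data.frame(i1 = c(16, 40, 26, 18),
                   i2 = c(17, 46, 27, 10),
                   i3 = c(23, 43, 24, 10),
                   i4 = c(19, 25, 20, 36))
rownames(test) <- c("very i.", "i.", "less i.", "not i.")

# t() turns items into rows and answer categories into columns;
# [, 4:1] reverses the column order so it runs from "not i." to "very i."
m <- t(test)[, 4:1]
m
```

The result is a numeric matrix with one row per item and one column per answer category, ordered from the most negative to the most positive response.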


One particularly appealing feature of likert() is the ability to center a neutral response with the ReferenceZero argument. (Notice how it uses an appropriate grey color for the reference response):

likert(t(test)[,4:1], horizontal=FALSE,
       main = NULL, # or give "title",
       xlab = "Percent", # becomes ylab due to horizontal arg
       ReferenceZero = 3,
       auto.key=list(space = "right", columns = 1,
                     reverse = TRUE))

likert data centered on one response

(These examples use vertical bars as is common, but horizontal=TRUE is often better, especially if one wants to include question or scale names.)

MattBagg