
I am trying to make a stacked bar chart, and I would like to reorder the values on the x-axis based on the data from a single category. In the example below there are three x values (`name`), each of which has values for three categories (`cat`). How can I plot the graph with the `name` values sorted by increasing abundance of "bb"?

Although this is similar to other questions about reordering categorical variables, the difference here is that the ordering is based on a subset of one column's data. Any suggestions appreciated.

library(ggplot2)

# create the data frame
name = c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c')
cat = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc")
percent = c(5, 5, 90, 40, 40, 20, 90, 5, 5)
df = data.frame(name, cat, percent)

# stacked bar chart with default ordering
ggplot(df, aes(x = name, y = percent, fill = cat)) + geom_bar(stat = "identity", position = "fill")

# I'm looking to reorder the x-axis by the `percent` values for category "bb"
vals = df[df$cat == 'bb', ]                      # subset
xvals = vals[with(vals, order(percent)), ]$name  # names ordered by their "bb" percent
# attempt to order with the new values
ggplot(df, aes(x = reorder(name, xvals), y = percent, fill = cat)) + geom_bar(stat = "identity", position = "fill")
zach

3 Answers

df$name2 <- factor(df$name, levels = xvals)
ggplot(df, aes(x = name2, y = percent, fill = cat)) +
  geom_bar(stat = "identity", position = "fill")
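For reference, here is a self-contained sketch of the same idea that rebuilds the data and `xvals` from the question, so it can be run on its own. It assumes ggplot2 is installed for the plotting step; the helper name `bb` is introduced here only for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(
  name    = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  cat     = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
  percent = c(5, 5, 90, 40, 40, 20, 90, 5, 5)
)

# Names sorted by increasing percent within category "bb"
bb    <- df[df$cat == "bb", ]                       # hypothetical helper name
xvals <- as.character(bb$name[order(bb$percent)])

# A factor whose levels follow that order; the x-axis is drawn in level order
df$name2 <- factor(df$name, levels = xvals)
levels(df$name2)

# Plot only if ggplot2 is available (assumption: it is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = name2, y = percent, fill = cat)) +
    geom_bar(stat = "identity", position = "fill")
  print(p)
}
```

Note that "a" and "c" tie at 5 for "bb" in this data, so their relative order is decided by how `order()` breaks ties; "b" comes last either way.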


Henrik

There are two problems here. The first is that you want to re-sort based on the percent in bb. The second is that ggplot orders a categorical x-axis by its factor levels, which are alphabetical by default, so you need to get around that.

First, to re-sort your data, ironically you need to transform to wide format, sort, and then re-transform to long format:

library(reshape2)                                   # dcast() and melt(); assuming reshape2 here
zz <- dcast(df, name ~ cat, value.var = "percent")  # columns for aa, bb, cc
yy <- zz[order(zz$bb), ]                            # order by bb
yy <- cbind(id = 1:nrow(yy), yy)                    # add an id column; will need later
gg <- melt(yy, id.vars = c("id", "name"), variable.name = "cat", value.name = "percent")

Then:

ggplot(gg, aes(x = factor(id), y = percent, fill = cat)) +
  geom_bar(stat = "identity") +
  scale_x_discrete(labels = as.character(yy$name)) +  # one label per id, in sorted order
  labs(x = "name")

Produces this:

jlhoward
  • This is a clean and elegant answer. – zach Dec 05 '13 at 22:47
  • Thanks, but @Henrik's answer is much better. When `x` is set to a factor, `ggplot` orders it based on the factor levels, which are alphabetical by default. @Henrik created a new factor, `name2`, whose levels are ordered based on OP's `xvals`, then used that factor for x. This is a much cleaner way to do it. – jlhoward Dec 05 '13 at 23:36
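
To illustrate the point about factor levels, a small base-R sketch follows. It also shows one possible way to get the same ordering directly with `reorder()`, by scoring each name with its "bb" percent only; this variant is not from the answers above, and the name `name_by_bb` is made up for the example.

```r
df <- data.frame(
  name    = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  cat     = c("aa", "bb", "cc", "aa", "bb", "cc", "aa", "bb", "cc"),
  percent = c(5, 5, 90, 40, 40, 20, 90, 5, 5)
)

# Default: levels are the sorted unique values, hence the alphabetical axis
default_levels <- levels(factor(df$name))
default_levels

# reorder() computes one score per name; rows outside "bb" contribute 0,
# so the sum for each name is just its "bb" percent
name_by_bb <- reorder(df$name, df$percent * (df$cat == "bb"), FUN = sum)
levels(name_by_bb)
```

Using `x = name_by_bb` (or the same `reorder()` call inside `aes()`) should then give an axis ordered by increasing "bb", with ties between names left in whatever order the sort produces.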

like this...?

df$name2 = factor(df$name, levels = levels(df$name), labels = xvals)
ggplot(df, aes(x = name2, y = percent, fill=cat)) + geom_bar(position="fill") 
sparrow
  • thanks for having a go @sparrow but the resulting df doesn't look right to me. `name2` has different values than `name` and the resulting plot doesn't reorder along 'bb' – zach Dec 05 '13 at 22:18