
I have two columns in a data.frame whose factor levels should be sorted in the same order, but I don't know how to do that in a straightforward manner.

Here's the situation:

library(ggplot2)
library(dplyr)
library(magrittr)
set.seed(1)
df1 <- data.frame(rating = sample(c("GOOD", "BAD", "AVERAGE"), 10, replace = TRUE),
                  div = sample(c("A", "B", "C"), 10, replace = TRUE),
                  n = sample(100, 10, replace = TRUE))

# I'm adding a label column that I use for plotting purposes
df1 <- df1 %>% group_by(rating) %>% mutate(label = paste0(rating," (",sum(n),")")) %>% ungroup
# # A tibble: 10 x 4
#     rating    div     n         label
#     <fctr> <fctr> <int>         <chr>
#  1     BAD      C    48     BAD (220)
#  2     BAD      B    87     BAD (220)
#  3     BAD      C    44     BAD (220)
#  4    GOOD      B    25     GOOD (77)
#  5 AVERAGE      B     8 AVERAGE (117)
#  6 AVERAGE      C    10 AVERAGE (117)
#  7 AVERAGE      A    32 AVERAGE (117)
#  8    GOOD      B    52     GOOD (77)
#  9 AVERAGE      C    67 AVERAGE (117)
# 10     BAD      C    41     BAD (220)

# rating levels are sorted
df1$rating <- factor(df1$rating,c("BAD","AVERAGE","GOOD"))

ggplot(df1,aes(x=rating,y=n,fill=div)) + geom_col() # plots in the order I want
ggplot(df1,aes(x=label,y=n,fill=div)) + geom_col()  # doesn't because levels aren't sorted

How can I copy the factor level order from one column to another? I can make it work this way, but it feels really awkward:

lvls <- df1 %>% select(rating,label) %>% unique %>% arrange(rating) %>% extract2("label")
df1$label <- factor(df1$label,lvls)
ggplot(df1,aes(x=label,y=n,fill=div)) + geom_col()
www
moodymudskipper

2 Answers


Instead of adding a label column and using aes(x = label, ...), you can keep aes(x = rating, ...) and create the labels in scale_x_discrete:

ggplot(df1, aes(x = rating, y = n, fill = div)) +
  geom_col() +
  scale_x_discrete(labels = df1 %>%
                     group_by(rating) %>%
                     summarize(n = sum(n)) %>%
                     mutate(lab = paste0(rating, " (", n, ")")) %>%
                     pull(lab))
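
A variant of the same idea, as a rough sketch: the pipeline above relies on the summarised rows coming out in the same order as the axis breaks. Passing a named vector instead should let ggplot2 match each label to its level by name. The data below is typed in from the tibble printed in the question rather than re-sampled, and `totals` and `axis_labels` are names made up for this illustration.

```r
# Data copied from the tibble printed in the question (no random sampling)
df1 <- data.frame(
  rating = c("BAD", "BAD", "BAD", "GOOD", "AVERAGE",
             "AVERAGE", "AVERAGE", "GOOD", "AVERAGE", "BAD"),
  div    = c("C", "B", "C", "B", "B", "C", "A", "B", "C", "C"),
  n      = c(48, 87, 44, 25, 8, 10, 32, 52, 67, 41)
)
df1$rating <- factor(df1$rating, levels = c("BAD", "AVERAGE", "GOOD"))

# Sum n per rating; tapply() returns the sums named by level, in level order
totals <- tapply(df1$n, df1$rating, sum)

# Named character vector: names are the factor levels, values are the labels
axis_labels <- setNames(paste0(names(totals), " (", totals, ")"), names(totals))
print(axis_labels)

# Plot only if ggplot2 is installed; x stays mapped to the ordered factor
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df1, aes(x = rating, y = n, fill = div)) +
    geom_col() +
    scale_x_discrete(labels = axis_labels)
  print(p)
}
```

Because x is still `rating`, the bar order follows the levels of `rating`, and only the text shown on the axis changes.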


Henrik

Once you have set the levels of rating, you can use forcats to set the levels of label by the order of rating like this...

library(forcats)
df1 <- df1 %>% group_by(rating) %>% 
               mutate(label=paste0(rating," (",sum(n),")")) %>% 
               ungroup %>% 
               arrange(rating) %>%              #sort by rating
               mutate(label=fct_inorder(label)) #set levels by order in which they appear

Or you can use forcats::fct_reorder to do the same thing...

df1$label <- fct_reorder(df1$label, as.numeric(df1$rating))

The plot then has the bars in the right order.
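
For comparison, here is a minimal base R sketch of the same level-copying idea, in case adding forcats is not an option. It is not part of the forcats approach above, just an equivalent under the assumption that each label corresponds to exactly one rating. The data is typed in from the tibble printed in the question, and `code_per_label` is a name made up for the final check.

```r
# Data copied from the tibble printed in the question (no random sampling)
df1 <- data.frame(
  rating = c("BAD", "BAD", "BAD", "GOOD", "AVERAGE",
             "AVERAGE", "AVERAGE", "GOOD", "AVERAGE", "BAD"),
  div    = c("C", "B", "C", "B", "B", "C", "A", "B", "C", "C"),
  n      = c(48, 87, 44, 25, 8, 10, 32, 52, 67, 41)
)
df1$rating <- factor(df1$rating, levels = c("BAD", "AVERAGE", "GOOD"))

# ave() repeats the per-rating sum on every row of that rating
df1$label <- paste0(df1$rating, " (", ave(df1$n, df1$rating, FUN = sum), ")")

# order() on a factor sorts by its level codes, so the labels come out in
# rating order; unique() keeps the first occurrence of each one
df1$label <- factor(df1$label, levels = unique(df1$label[order(df1$rating)]))
print(levels(df1$label))

# Sanity check: the rating code found under each label level should be 1, 2, 3
code_per_label <- tapply(as.integer(df1$rating), df1$label, function(x) unique(x))
print(code_per_label)
```

The check at the end is only there to confirm that the two columns now share the same level order. It can be dropped once the result looks right.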

Andrew Gustar
  • Thank you, this package is interesting. Exploring it, I found the function `fct_reorder`, which permits a less verbose approach: `df1$label <- fct_reorder(df1$label, as.numeric(df1$rating))`. Maybe you could add it to your answer? – moodymudskipper Sep 25 '17 at 09:46
  • Thanks - will do! – Andrew Gustar Sep 25 '17 at 10:05