I have a data frame with the column word and I want to show the top 10 words in the text in a bar chart with ggplot.

This is the code:

text_df %>% count(word, sort = TRUE) %>% top_n(10)

The result is as expected. Now I want to show that in a graph:

text_df %>% count(word, sort = TRUE) %>% top_n(10) %>%
ggplot(aes(word, n)) + geom_col()

The sorting is now lost and the ten words appear in what looks to me like a random order. Why is the sorting lost? Am I using the commands incorrectly?

deancorso
    Try `text_df %>% count(word, sort = TRUE) %>% top_n(10) %>% mutate(word = forcats::fct_inorder(word))`. That will "lock in" the ordering by making `word` an ordered factor, in the order it appears in your sorted output. – Jon Spring Sep 14 '19 at 15:36

1 Answer

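First, the reason your plot loses its ordering: ggplot2 orders a discrete axis by factor levels, not by row order. `word` is a character column, so it is treated like a factor with alphabetically sorted levels, and the row order produced by `count(sort = TRUE)` is ignored.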

Converting `word` to a factor with `fct_reorder()` from the forcats package before passing the data to ggplot fixes this:

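library(dplyr)    # %>%, count(), top_n(), mutate()
library(forcats)  # fct_reorder()
library(ggplot2)
text_df %>% count(word, sort = TRUE) %>% top_n(10) %>%
  # set the factor levels by descending count, so the largest bar comes first
  mutate(word = fct_reorder(word, -n)) %>%
  ggplot(aes(word, n)) + geom_col()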
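
To see why the order is lost without plotting anything, you can look at the factor levels directly. This is only a small sketch: the `counts` data frame and its numbers are made up for illustration, and it uses base R's `reorder()` in place of `fct_reorder()` so that it runs without extra packages. For this purpose the two do the same job.

```r
# Hypothetical word counts, already sorted by n (made-up data for illustration)
counts <- data.frame(
  word = c("the", "and", "of", "to"),
  n = c(50, 30, 20, 10),
  stringsAsFactors = FALSE
)

# A character column turned into a factor gets alphabetically sorted levels,
# no matter how the rows are ordered
default_levels <- levels(factor(counts$word))

# reorder() sets the levels by a second variable; negating n gives
# descending order, like fct_reorder(word, -n) in the answer
counts$word <- reorder(counts$word, -counts$n)
sorted_levels <- levels(counts$word)

print(default_levels)
print(sorted_levels)
```

The first vector is in alphabetical order and the second follows the counts, which is the order ggplot would then use on the x axis.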