Text Mining with R introduces methods for unsupervised classification of documents, such as blog posts or news articles, under the heading of topic modeling. I'm running the code from this link, but I don't know how to obtain Figure 6.3, "Words with the greatest difference in beta between topic 2 and topic 1".

Any suggestions please?

Mark

1 Answer

The book's source is available: click the edit button on any page and you are taken to the GitHub project, at the file for that page. Navigate to the chapter you need (an Rmd file) and look for the text closest to the image.

Thankfully, this image was also made with R, so you can check the code directly: here

Posting for completeness:

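library(dplyr)
library(ggplot2)

# beta_spread is assumed to hold one row per term with a log_ratio column
beta_spread %>%
  # keep the 10 largest |log_ratio| values on each side of zero
  group_by(direction = log_ratio > 0) %>%
  top_n(10, abs(log_ratio)) %>%
  ungroup() %>%
  # order the bars by log_ratio instead of alphabetically
  mutate(term = reorder(term, log_ratio)) %>%
  ggplot(aes(term, log_ratio)) +
  geom_col() +
  labs(y = "Log2 ratio of beta in topic 2 / topic 1") +
  coord_flip()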
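
The plotting code expects a `beta_spread` table to exist already. As a rough sketch of how such a table could be built: the `toy_topics` data frame below is hypothetical, with invented values, and stands in for the tidied per-topic word probabilities of a two-topic model. The 0.001 cutoff is also an assumption chosen for illustration, so check the chapter source for the exact steps the book uses.

```r
library(dplyr)
library(tidyr)

# Hypothetical per-topic-per-term probabilities (invented values),
# standing in for the tidied beta matrix of a fitted two-topic model.
toy_topics <- tibble(
  topic = c(1, 2, 1, 2, 1, 2),
  term  = c("bank", "bank", "party", "party", "the", "the"),
  beta  = c(0.008, 0.002, 0.001, 0.004, 0.00005, 0.00005)
)

beta_spread <- toy_topics %>%
  # turn topic numbers into column names: topic1, topic2
  mutate(topic = paste0("topic", topic)) %>%
  # one row per term, one beta column per topic
  pivot_wider(names_from = topic, values_from = beta) %>%
  # drop terms that are rare in both topics (assumed cutoff)
  filter(topic1 > 0.001 | topic2 > 0.001) %>%
  # positive values: more typical of topic 2; negative: topic 1
  mutate(log_ratio = log2(topic2 / topic1))

print(beta_spread)
```

With real data, replacing `toy_topics` with the tidied model output should yield a table that can be passed straight to the plotting pipeline.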
Yuri-M-Dias