I'd like to recreate a plot from the Text Mining with R web textbook, but with my own data. It essentially counts how often chosen terms appear per year and graphs them (Figure 5.4: http://tidytextmining.com/dtm.html). My data is a bit cleaner than the data they started with, but I'm new to R. My data has a "Date" column in 2016-01-01 format (it's of class Date). I only have data from 2016, so I want to do the same thing at a finer granularity (i.e. by month or by day). The textbook code is:
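library(dplyr)    # %>%, group_by(), mutate(), filter()
library(tidyr)    # extract(), complete()
library(ggplot2)

# inaug_td is the tidied document-term table built earlier in the textbook chapter
year_term_counts <- inaug_td %>%
  # pull the four-digit year out of the document name
  extract(document, "year", "(\\d+)", convert = TRUE) %>%
  # add zero-count rows for year/term pairs that never occur
  complete(year, term, fill = list(count = 0)) %>%
  group_by(year) %>%
  mutate(year_total = sum(count))

year_term_counts %>%
  filter(term %in% c("god", "america", "foreign", "union", "constitution",
                     "freedom")) %>%
  ggplot(aes(year, count / year_total)) +
  geom_point() +
  geom_smooth() +
  facet_wrap(~ term, scales = "free_y") +
  scale_y_continuous(labels = scales::percent_format()) +
  ylab("% frequency of word in inaugural address")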
The idea is that I would choose specific words from my text and see how their frequency changes over the months.
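To make the goal concrete, here is a rough sketch of what I imagine the monthly version might look like. It is not tested on my real data: the data frame `my_data`, its `text` column, and the example words are hypothetical stand-ins (only the `Date` column matches what I actually have), and it assumes the tidytext package for tokenizing. I used `geom_line()` instead of `geom_smooth()` because the toy data has only three months.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(ggplot2)

# Hypothetical stand-in for my data: one row per document, with a Date column
# (class Date) and a text column. The name "text" is an assumption.
my_data <- tibble(
  Date = as.Date(c("2016-01-05", "2016-01-20", "2016-02-11", "2016-03-02")),
  text = c("freedom and union", "freedom for america",
           "the union of america", "god bless america")
)

month_term_counts <- my_data %>%
  # truncate each date to the first day of its month
  mutate(month = as.Date(format(Date, "%Y-%m-01"))) %>%
  # one row per word, then count each word within each month
  unnest_tokens(term, text) %>%
  count(month, term, name = "count") %>%
  # add zero-count rows for month/term pairs that never occur
  complete(month, term, fill = list(count = 0)) %>%
  group_by(month) %>%
  mutate(month_total = sum(count)) %>%
  ungroup()

p <- month_term_counts %>%
  filter(term %in% c("freedom", "america", "union")) %>%
  ggplot(aes(month, count / month_total)) +
  geom_point() +
  geom_line() +
  facet_wrap(~ term, scales = "free_y") +
  scale_x_date(date_breaks = "1 month", date_labels = "%b") +
  scale_y_continuous(labels = scales::percent_format()) +
  ylab("% frequency of word per month")

print(p)
```

For a by-day version, I assume the `mutate(month = ...)` step could be dropped and `Date` used directly in place of `month` everywhere.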
Thank you!