I have calculated lexical diversity for my DFM in quanteda and want to plot it over time. Each document in my corpus has docvars for year, month, and date. Is there a way to combine these and produce a plot of lexical diversity over time?

nasserq

1 Answer

To plot lexical diversity over time, first compute it per time period: group the data by time (month or year, whichever suits you) and calculate the lexical diversity within each group. That gives one value per group, which you can then plot.

Example:

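library(dplyr)

# doc1_final has one word per row; lexical diversity = unique words / total words per group
lex_div <- doc1_final %>%
  group_by(Page) %>%
  summarise(lex_div = length(unique(word)) / length(word))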

I have attached a picture of the doc1_final object. It is a data frame broken down into words, i.e. one word per row. I pass doc1_final to group_by() and then perform the calculation on the grouped data.

You will need to install the 'dplyr' package to run the above code.
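To tie this back to the question, here is a minimal base R sketch of the same grouping idea applied to time. It assumes the words are in a data frame with year, month and day columns copied from the docvars; the `words` data frame, its toy values and the column names are all made up for illustration, and the measure is the simple type-token ratio used above rather than anything computed by quanteda itself.

```r
# Hypothetical toy data: one row per word, with year/month/day columns
# standing in for the docvars described in the question.
words <- data.frame(
  year  = c(2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L),
  month = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
  day   = c(5L, 5L, 5L, 5L, 10L, 10L, 10L),
  word  = c("the", "cat", "the", "dog", "a", "b", "c"),
  stringsAsFactors = FALSE
)

# Combine the three columns into a single Date.
words$date <- as.Date(sprintf("%04d-%02d-%02d", words$year, words$month, words$day))

# Collapse each date to the first day of its month, so grouping is monthly.
words$period <- as.Date(format(words$date, "%Y-%m-01"))

# Type-token ratio (unique words / total words) for each month.
lex_div <- tapply(words$word, words$period,
                  function(w) length(unique(w)) / length(w))

# One row per month, ready for plotting.
ttr <- data.frame(period  = as.Date(names(lex_div)),
                  lex_div = as.numeric(lex_div))

# Line plot of lexical diversity over time.
plot(ttr$period, ttr$lex_div, type = "b",
     xlab = "Month", ylab = "Lexical diversity (TTR)")
```

To group by year instead, change the format string to "%Y-01-01". The same `period` column could also be passed to group_by() in the dplyr example.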

Ali Jawaad