I use quanteda to build a document-term matrix (dtm):
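library(quanteda)
mytext <- "This is my old text"
# Tokenize first: newer quanteda versions expect tokens, not raw text, as input to dfm()
dtm <- dfm(tokens(mytext), tolower = TRUE)
convert(dtm, to = "data.frame")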
Which yields:
  doc_id this is my old text
1  text1    1  1  1   1    1
I need to fit new text (a new corpus) to my existing dtm, using the same vocabulary so that the resulting matrix has exactly the same columns.
Suppose my new text is:
newtext = "This is my new text"
How can I fit this new text/corpus to the existing dtm vocabulary, so that I get a matrix like this:
  doc_id this is my old text
1  text1    1  1  1   0    1
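For reference, here is a minimal sketch of the kind of result I am after. It assumes a quanteda version that provides dfm_match() and that tokenizing with tokens() before dfm() is acceptable; new_dtm is just an illustrative name, and I am not sure this is the recommended way.

```r
library(quanteda)

mytext  <- "This is my old text"
newtext <- "This is my new text"

# Reference dfm built from the original text (defines the vocabulary)
dtm <- dfm(tokens(mytext), tolower = TRUE)

# dfm for the new text, built independently
new_raw <- dfm(tokens(newtext), tolower = TRUE)

# Align the new dfm to the old vocabulary: features not in `dtm` ("new")
# are dropped, and features missing from the new text ("old") are added
# as zero-count columns, in the same column order as `dtm`.
new_dtm <- dfm_match(new_raw, features = featnames(dtm))

convert(new_dtm, to = "data.frame")
```

The idea is that the column names of new_dtm come only from featnames(dtm), so anything fitted on the old matrix should see the same columns in the same order.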