I build a model of document classification from the training set of documents. Classification is done by the vector representation of each document, that is, a row in the Document-Term Matrix. Then to test the model, I need the representation of each document in the test set. How can I do that since not every term has been included in the training set (hence the Document-Term Matrix)?
Asked
Active
Viewed 28 times