I am trying to build a predictive model based on text mining, and I am unsure how many features I should use. I have 1,000 documents in my analysis (so the corpus will contain around 700 of them). The number of terms in the corpus is around 20,000, so it exceeds the number of documents (P >> N). Does it make sense to have so many features?
Should the number of features in the HashingTF method be higher than the total number of terms in the corpus? Or should I make it smaller (e.g. 512 features)?
I am a little bit confused.
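
To make the question concrete, here is a minimal sketch of how I understand the hashing trick: each term is hashed to one of `num_features` buckets, so terms that share a bucket become indistinguishable. This is not HashingTF itself. The MD5-based `bucket` function, the `occupied_buckets` helper, and the synthetic 20,000-term vocabulary are my own assumptions for illustration, and the real implementation uses a different hash function, so the exact counts will differ.

```python
import hashlib


def bucket(term: str, num_features: int) -> int:
    """Map a term to a bucket index using a stable hash (MD5, illustrative only)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_features


def occupied_buckets(vocabulary, num_features: int) -> int:
    """Count how many distinct buckets the vocabulary actually lands in."""
    return len({bucket(term, num_features) for term in vocabulary})


# Synthetic stand-in for a corpus vocabulary of about 20,000 distinct terms.
vocabulary = [f"term{i}" for i in range(20_000)]

for num_features in (512, 2**15, 2**18):
    used = occupied_buckets(vocabulary, num_features)
    # Terms beyond the number of occupied buckets share a bucket with another term.
    merged = len(vocabulary) - used
    print(f"num_features={num_features}: {used} buckets used, {merged} terms merged")
```

With 512 features, at most 512 distinct columns can exist, so nearly all of the 20,000 terms must share a bucket with others. With a feature count well above the vocabulary size, most terms get their own bucket. What I cannot tell is which side of that trade-off is better when I only have about 700 documents.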