My document-term matrix contains empty documents, and I need to remove them. This is the code I used to build the DocumentTermMatrix:
tweets_dtm_tfidf <- DocumentTermMatrix(tweet_corpus, control = list(weighting = weightTfIdf))
And this is the warning message I am getting:
Warning message:
In weighting(x) :
empty document(s): 823 3795 4265 7252 7295 7425 8240 8433 9303 12160 12278 14465 15166 15485 15933 20775 21666 21807 26131 27039 34035 34050 34101
I tried removing these empty documents using this code:
rowTotals <- apply(tweets_dtm_tfidf , 1, sum)
dtm_tfidf <- tweets_dtm_tfidf[rowTotals> 0, ]
Here is the error I get when trying to remove them:
> rowTotals <- apply(tweets_dtm_tfidf , 1, sum)
Error: cannot allocate vector of size 6.8 Gb
Any idea how to go about this? Thanks in advance for any suggestions.
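One possible direction, offered as a sketch rather than a definitive fix: `apply()` converts its argument with `as.matrix()`, so the sparse matrix is expanded into a dense one, which is the likely source of the 6.8 Gb allocation. The `slam` package (which `tm` uses for its sparse matrices) provides `row_sums()`, which works directly on the sparse representation. The example below uses a hypothetical three-document `toy_corpus` in place of `tweet_corpus`, and it assumes it is acceptable to drop the empty documents before applying the TF-IDF weighting.

```r
library(tm)    # DocumentTermMatrix, weightTfIdf, nDocs
library(slam)  # row_sums() for sparse simple_triplet_matrix objects

# Hypothetical toy corpus standing in for tweet_corpus; the second document is empty.
toy_corpus <- VCorpus(VectorSource(c("hello world", "", "foo bar hello")))

# Build the matrix with plain term-frequency weights first.
toy_dtm <- DocumentTermMatrix(toy_corpus)

# row_sums() operates on the sparse triplets, so no dense matrix is allocated.
row_totals <- slam::row_sums(toy_dtm)

# Keep only the documents that contain at least one term.
toy_dtm_nonempty <- toy_dtm[row_totals > 0, ]

# Apply the TF-IDF weighting once the empty documents are gone.
toy_dtm_tfidf <- weightTfIdf(toy_dtm_nonempty)
```

The same `slam::row_sums()` call should also work on the already weighted `tweets_dtm_tfidf`. Filtering before weighting has the advantage that the empty documents no longer count toward the document total used for the inverse document frequency.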