I'm taking part in this Kaggle competition, and I'm wondering whether anyone is familiar with the textmatrix function from the lsa package in R.
Basically, textmatrix accepts a directory as an argument and builds a text matrix (terms by documents) from all the text files found in that directory.
Unfortunately, textmatrix throws an error when it comes across a text file that ends up with zero terms (this can happen after filtering out stop words, for example).
Does anyone know of a simple way to have textmatrix ignore files that end up with zero terms? Or of a relatively quick way to identify and remove these files?
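To make the second option concrete, here is a rough sketch in base R of the kind of pre-filter that might work. The helpers count_terms and nonempty_files are hypothetical names, not part of lsa, and the tokenization only approximates whatever textmatrix does internally, so treat it as a starting point rather than a fix. It also assumes textmatrix will accept a vector of file paths in place of a directory, which should be checked against the package documentation.

```r
# Hypothetical helper (not part of lsa): count the terms a file would keep
# after lower-casing, splitting on non-alphanumeric characters, dropping
# tokens shorter than min_len, and removing stop words.
count_terms <- function(path, stopwords = character(0), min_len = 2) {
  txt <- tolower(paste(readLines(path, warn = FALSE), collapse = " "))
  tokens <- unlist(strsplit(txt, "[^[:alnum:]]+"))
  tokens <- tokens[nchar(tokens) >= min_len]
  sum(!(tokens %in% stopwords))
}

# Hypothetical helper: return the files in `dir` that keep at least one term.
nonempty_files <- function(dir, stopwords = character(0), min_len = 2) {
  files <- list.files(dir, full.names = TRUE)
  files <- files[!dir.exists(files)]  # skip subdirectories
  counts <- vapply(files, count_terms, integer(1),
                   stopwords = stopwords, min_len = min_len)
  files[counts > 0]
}

# Demo on a throwaway directory: one normal file, one made only of stop words.
demo_dir <- file.path(tempdir(), "textmatrix_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("the cat sat on the mat", file.path(demo_dir, "doc1.txt"))
writeLines("the and of", file.path(demo_dir, "doc2.txt"))

stops <- c("the", "and", "of", "on")  # illustrative stop-word list
keep <- nonempty_files(demo_dir, stopwords = stops)
print(basename(keep))

# Assumption: textmatrix() accepts a vector of file paths, not only a directory.
# tm <- lsa::textmatrix(keep, stopwords = stops)
```

If the counts disagree with what textmatrix sees (for example because of stemming or a different minimum word length), the same loop could instead move the offending files into a separate folder so they can be inspected by hand.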
TIA!