How to create a Document Term Matrix in R (using LSA)?

Question

I'm trying to build a document-term matrix in R using the lsa package for my research. The txt file I'm trying to read contains the text of 10,000 tweets, and it is not empty. However, displaying the resulting TDM gives the error below. I'm using this package specifically because it is the one used in a paper I'm following, and it has some other interesting tools that I'll need later on. The paper is: Gefen, D., Endicott, J. E., Fresneda, J. E., Miller, J., & Larsen, K. R. (2017). A guide to text analysis with latent semantic analysis in R with annotated code: Studying online reviews and the stack exchange community. Communications of the Association for Information Systems, 41(1), 21.

Can anyone help? Thanks

Below is my code:

install.packages("LSAfun")
library(LSAfun)
library(lsa)

# Load the English stopword list shipped with lsa
data(stopwords_en)

# Directory containing the text file(s) to read
source_dir <- "C:\\Users\\Alexander Hiscock\\Desktop\\Phd R Stuff\\export_txt2"

# Create the term-document matrix
TDM <- textmatrix(source_dir, stopwords = stopwords_en, stemming = TRUE,
                  removeNumbers = FALSE, minGlobFreq = 2)

TDM
# Error in if ((nc <= (3 * bag_cols)) && (nr <= (3 * bag_lines))) { : 
#   missing value where TRUE/FALSE needed
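
One thing that may be worth ruling out is the layout of the input. textmatrix() takes a directory (or a vector of file names) and treats each file as one document, so a single file holding all 10,000 tweets would be read as one document rather than 10,000. Whether that is what triggers the error above is not confirmed here. The sketch below is only an illustration under stated assumptions: it assumes one tweet per line, and the toy file, the directory name, and the tweet_00001.txt naming scheme are all hypothetical. It splits the file into one file per tweet and checks the dimensions of the result before printing it.

```r
# Assumption: the source file holds one tweet per line. The toy file below
# stands in for the real export; all file and directory names are hypothetical.
tweets_file <- file.path(tempdir(), "tweets.txt")
writeLines(c("cats chase mice and cats sleep",
             "dogs chase cats and dogs bark",
             "mice eat cheese and mice hide"), tweets_file)

# Write each tweet to its own file so that every tweet is a separate document.
corpus_dir <- file.path(tempdir(), "tweets_split")
unlink(corpus_dir, recursive = TRUE)  # start from an empty directory
dir.create(corpus_dir)
tweets <- readLines(tweets_file, warn = FALSE)
tweets <- tweets[nzchar(trimws(tweets))]  # drop empty lines
for (i in seq_along(tweets)) {
  writeLines(tweets[i], file.path(corpus_dir, sprintf("tweet_%05d.txt", i)))
}

# Build the matrix only if lsa is installed, and look at its shape first.
if (requireNamespace("lsa", quietly = TRUE)) {
  tdm <- lsa::textmatrix(corpus_dir)
  print(dim(tdm))  # terms x documents; NULL would mean the result is not a matrix
}
```

If dim() reports one column per tweet, the stopwords, stemming and minGlobFreq arguments from the original call can be added back one at a time to see which of them, if any, brings the error back.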

