I am working on counting the frequency of unique words in a text document in R 3.2.2. I have collapsed a large number of articles into a single text document and turned it into a corpus using the tm package.
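library(tm)
# Collapse all articles into one string, then wrap it in a one-document corpus
desc <- paste(column_input, collapse = " ")
desrc <- VectorSource(desc)
decorp <- Corpus(desrc)
# The two candidates I am choosing between:
# dedtm <- DocumentTermMatrix(decorp)
# dedtm <- TermDocumentMatrix(decorp)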
There are 12,000-odd terms in that one text document. Before I go on to matrix operations, I am not sure which is the better choice: a term-document matrix or a document-term matrix?
I assume it depends on context. Is a term-document matrix preferable to a document-term matrix when there are few documents and many terms? I just want to understand the logic behind the choice, so I hope a reproducible example is not needed. Any suggestions would be greatly appreciated.
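To make the comparison concrete, here is a minimal sketch on toy data. The two-element column_input is a made-up stand-in for my real articles, and the names freq_from_tdm and freq_from_dtm are only for illustration. As far as I understand, the two matrices hold the same counts and differ only in orientation (terms as rows versus terms as columns), so the word frequencies should come out the same either way.

```r
library(tm)

# Hypothetical toy input standing in for the real column of articles
column_input <- c("apple banana apple", "banana cherry apple")

# Same steps as above: collapse to one string, build a one-document corpus
desc <- paste(column_input, collapse = " ")
decorp <- Corpus(VectorSource(desc))

tdm <- TermDocumentMatrix(decorp)  # terms in rows, documents in columns
dtm <- DocumentTermMatrix(decorp)  # documents in rows, terms in columns

dim(tdm)  # number of terms, then number of documents
dim(dtm)  # number of documents, then number of terms

# Word frequencies: sum across documents in whichever orientation is used
freq_from_tdm <- rowSums(as.matrix(tdm))
freq_from_dtm <- colSums(as.matrix(dtm))

sort(freq_from_tdm, decreasing = TRUE)
```

With a single collapsed document, one of the two dimensions is always 1, so the choice mostly affects whether I index terms by row or by column in later steps.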
Thanks in advance,
Bala