
I have a character vector like this:

str(apps)
 chr [1:17517] "35 44 33 40 33 40 44 38 33 37 37" ...

Each element is a string of numbers separated by spaces.

corpus<-Corpus(VectorSource(apps))
dtm<-DocumentTermMatrix(corpus)
str(dtm)
List of 6
 $ i       : int(0) 
 $ j       : int(0) 
 $ v       : num(0) 
 $ nrow    : int 17517
 $ ncol    : int 0
 $ dimnames:List of 2
  ..$ Docs : chr [1:17517] "1" "2" "3" "4" ...
  ..$ Terms: NULL
 - attr(*, "class")= chr [1:2] "DocumentTermMatrix" "simple_triplet_matrix"
 - attr(*, "weighting")= chr [1:2] "term frequency" "tf"

As the output shows, Terms is NULL and ncol is 0. I don't know exactly what data structure DocumentTermMatrix() expects; I was just following the thread "Document-Term-Matrix of tm Package in R". Can anyone help me solve this? Thanks.

ysfseu
  • Are all your "terms" two-digit numbers? By default, the `tm` library requires terms to have at least three characters. See [this answer](http://stackoverflow.com/a/27751604/2372064) to change the defaults. – MrFlick Aug 11 '15 at 02:52
  • @MrFlick Thanks, it works fine! Could you post your comment as an answer so that others can see it clearly? – ysfseu Aug 11 '15 at 03:04
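
For anyone landing here later, the fix from the comment above can be sketched roughly as follows. This is a minimal example, not the asker's actual code: the three-element `apps` vector is made-up stand-in data, and `VCorpus` is used explicitly on the assumption that it matches what `Corpus()` returned at the time. The only real change is passing `wordLengths = c(1, Inf)` in the `control` list so that two-character terms are no longer dropped.

```r
library(tm)

# Stand-in for the real `apps` vector (made-up values):
# each element is a string of space-separated two-digit numbers.
apps <- c("35 44 33 40", "33 40 44 38", "33 37 37")

corpus <- VCorpus(VectorSource(apps))

# With default settings, terms shorter than three characters are
# discarded, which leaves the matrix with zero columns for this data.
dtm_default <- DocumentTermMatrix(corpus)

# Lowering the minimum term length to 1 keeps the two-digit tokens.
dtm <- DocumentTermMatrix(corpus, control = list(wordLengths = c(1, Inf)))

Terms(dtm)
inspect(dtm)
```

After this change, `str(dtm)` should show a non-NULL Terms entry and a non-zero `ncol`.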
