The dimnames attribute of a matrix, if not NULL, is a list of the form list(rownames, colnames) storing the row names and column names of the matrix.
x <- matrix(1:9, 3L, 3L)
x
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9
dimnames(x) <- list(letters[1:3], LETTERS[1:3])
x
##   A B C
## a 1 4 7
## b 2 5 8
## c 3 6 9
Sometimes, it is convenient for the list itself to have names. These names act somewhat like axis titles:
names(dimnames(x)) <- c("lo", "UP")
x
##    UP
## lo  A B C
##   a 1 4 7
##   b 2 5 8
##   c 3 6 9
lo is printed on the same line as the column names, but it is really the title of the first dimension. Similarly, UP is the title of the second dimension.
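As a minimal sketch of the same idea, the titles can also be supplied in one step by passing a named list as the dimnames. The object name x2 is introduced here only for illustration.

```r
# Build the same matrix, naming the dimnames list up front.
x2 <- matrix(1:9, 3L, 3L,
             dimnames = list(lo = letters[1:3], UP = LETTERS[1:3]))

# The "axis titles" are simply the names of the dimnames list.
names(dimnames(x2))

# Indexing by row and column name is unaffected by the titles.
x2["b", "C"]

# Dropping the titles leaves the row and column names intact.
names(dimnames(x2)) <- NULL
dimnames(x2)
```

Since the titles live in names(dimnames(x2)), removing them changes only how the matrix is printed, not how it is indexed.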
TermDocumentMatrix and DocumentTermMatrix objects are not true R matrices. They store nonzero elements in triplet format for efficiency, as well as some metadata. However, like true R matrices, they can have a dimnames attribute. Since the rows and columns represent terms and documents (or vice versa), package tm assigns names Terms and Docs to the dimnames.
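To make "triplet format" concrete, here is a rough base R sketch that stores only the nonzero entries of a small matrix as (row, column, value) triplets and then rebuilds the dense matrix. The component names i, j, v, nrow, ncol and dimnames mirror what str() reports for a TermDocumentMatrix; the data are made up, and this is only an illustration of the layout, not how the underlying package implements it.

```r
# Small, mostly-zero matrix with hypothetical counts.
dense <- matrix(c(0, 1, 0,
                  0, 0, 2), nrow = 3L,
                dimnames = list(Terms = c("a", "b", "c"),
                                Docs = c("d1", "d2")))

# Triplet form: keep only the nonzero entries as (row, column, value).
nz <- which(dense != 0, arr.ind = TRUE, useNames = FALSE)
trip <- list(i = nz[, 1L], j = nz[, 2L], v = dense[nz],
             nrow = nrow(dense), ncol = ncol(dense),
             dimnames = dimnames(dense))
str(trip)

# Rebuild the dense matrix from the triplets, restoring the dimnames.
rebuilt <- matrix(0, trip$nrow, trip$ncol, dimnames = trip$dimnames)
rebuilt[cbind(trip$i, trip$j)] <- trip$v
identical(rebuilt, dense)
```

The dimnames are stored once alongside the triplets, which is why they survive the conversion back to an ordinary matrix.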
Taking an example from vignette("tm"):
library("tm")
reut21578 <- system.file("texts", "crude", package = "tm")
reuters <- VCorpus(DirSource(reut21578, mode = "binary"),
                   readerControl = list(reader = readReut21578XMLasPlain))
tdm <- TermDocumentMatrix(reuters)
str(tdm)
## List of 6
##  $ i       : int [1:2255] 14 35 49 157 202 203 233 274 290 291 ...
##  $ j       : int [1:2255] 1 1 1 1 1 1 1 1 1 1 ...
##  $ v       : num [1:2255] 1 1 1 1 1 1 1 1 1 1 ...
##  $ nrow    : int 1266
##  $ ncol    : int 20
##  $ dimnames:List of 2
##   ..$ Terms: chr [1:1266] "..." "\"(it)" "\"demand" "\"expansion" ...
##   ..$ Docs : chr [1:20] "127" "144" "191" "194" ...
##  - attr(*, "class")= chr [1:2] "TermDocumentMatrix" "simple_triplet_matrix"
##  - attr(*, "weighting")= chr [1:2] "term frequency" "tf"
Hence:
y <- as.matrix(tdm)[1:6, 1:6]
y
##             Docs
## Terms        127 144 191 194 211 236
##   ...          0   0   0   0   0   0
##   "(it)        0   0   0   0   0   0
##   "demand      0   1   0   0   0   0
##   "expansion   0   0   0   0   0   0
##   "for         0   0   0   0   0   0
##   "growth      0   0   0   0   0   0
dimnames(y)
## $Terms
## [1] "..." "\"(it)" "\"demand" "\"expansion" "\"for" "\"growth"
##
## $Docs
## [1] "127" "144" "191" "194" "211" "236"
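The names Terms and Docs behave like any other list names. The sketch below uses a small hand-built matrix, y2, as a hypothetical stand-in for as.matrix(tdm) so that it runs without tm; the counts and term labels are made up.

```r
# Hypothetical stand-in for a term-document matrix (counts are made up).
y2 <- matrix(c(0, 1, 0, 0, 2, 0), nrow = 3L,
             dimnames = list(Terms = c("oil", "price", "crude"),
                             Docs = c("127", "144")))

# Components of the dimnames list can be extracted by name.
dimnames(y2)$Terms
dimnames(y2)[["Docs"]]

# Transposing swaps the titles along with the dimensions,
# which corresponds to the document-term orientation.
names(dimnames(t(y2)))

# The titles become column names when converting via a table.
long <- as.data.frame(as.table(y2))
names(long)
```

If the titles are unwanted, names(dimnames(y2)) <- NULL removes them while keeping the row and column names.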