My data.csv file contains the following:
id,name
143,The sky is blue.
21,The sun is bright.
23,The sun in the sky is bright.
Now, I can read the whole file like this:
> file_loc <- "data.csv"
> x <- read.csv(file_loc, header = TRUE)
> x <- data.frame(lapply(x, as.character), stringsAsFactors=FALSE)
> require(tm)
Loading required package: tm
> dd <- Corpus(DataframeSource(x))
> dtm <- DocumentTermMatrix(dd, control = list(weighting = weightTfIdf))
The resulting matrix is:
> as.matrix(dtm)
    Terms
Docs       143     blue.   bright.       sky       sun the
1    0.3962406 0.3962406 0.0000000 0.1462406 0.0000000   0
2    0.0000000 0.0000000 0.1949875 0.0000000 0.1949875   0
3    0.0000000 0.0000000 0.1169925 0.1169925 0.1169925   0
What I want is to use the id column of the CSV file as the document names, like this:
    Terms
Docs     blue.   bright.       sky       sun the
143  0.3962406 0.0000000 0.1462406 0.0000000   0
21   0.0000000 0.1949875 0.0000000 0.1949875   0
23   0.0000000 0.1169925 0.1169925 0.1169925   0
Can anybody guide me on how to achieve the desired result?
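
For reference, here is a minimal sketch of one direction that might work. It is not a confirmed solution. It assumes the corpus can be built from the name column alone (so the ids are never tokenized as terms) and that the row order of the matrix follows the row order of the data frame. The data frame is recreated inline so the snippet does not depend on the CSV file, and the variable m is introduced only for illustration. Exact terms and weights may differ between tm versions.

```r
library(tm)

# Inline copy of the sample data, so the sketch runs without the CSV file.
x <- data.frame(
  id   = c("143", "21", "23"),
  name = c("The sky is blue.",
           "The sun is bright.",
           "The sun in the sky is bright."),
  stringsAsFactors = FALSE
)

# Build the corpus from the text column only; the id column is left out
# so that "143" cannot end up as a term.
dd  <- VCorpus(VectorSource(x$name))
dtm <- DocumentTermMatrix(dd, control = list(weighting = weightTfIdf))

# Convert to a plain matrix and label the rows with the ids.
# Assumption: documents appear in the same order as the rows of x.
m <- as.matrix(dtm)
rownames(m) <- x$id
m
```

With this, the Docs labels of m should be the id values rather than 1, 2, 3, and the id column no longer contributes a term of its own.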