I am doing review analytics for a cruise company. The whole procedure is too lengthy to describe, so here is a snapshot of it. I took all the reviews, split them into sentences, and then extracted phrases from each review, e.g. 'Wonderful cabin', 'Excellent service'. For sentiment analysis I now have to map all the nouns in those phrases to a particular theme. For that mapping I need all the synonyms of each noun and all the words related to it (as I asked), so that the final result of my text mining will be more efficient.

To repeat my question: in Excel I have one row of words (nouns). When I run some code (R, VBA, or anything else), it should give me all the words related to those words. I have already extracted the synonyms with VBA code. I hope that makes it clear.
- Hm, what would be the use case for such a dictionary? Maybe stemming the words to a common root is an alternative. If not, you should perhaps look at a database like [WordNet](http://en.wikipedia.org/wiki/WordNet). The question is probably too broad anyway. – lukeA Feb 11 '15 at 11:20
- I don't know about R bindings, but see https://www.nodebox.net/code/index.php/Linguistics - verb.infinitive()/present_participle() - the approach they take is documented. – Alex K. Feb 11 '15 at 11:27
- @Roland: can an OP accept an answer to a question put on hold? – lawyeR Feb 11 '15 at 13:28
1 Answer
You can use the tm package and its stemming capabilities.
If your words are in a character vector such as
library(tm)  # stemDocument() also needs the SnowballC package installed
text <- c("taste", "tastes", "tasting")
you can create a corpus
corpus <- Corpus(VectorSource(text))
and then have the stem function strip the words to their roots. (The helper function avoids some problems.)
stemDocumentfix <- function(x){ # put in business code
PlainTextDocument(paste(stemDocument(unlist(strsplit(as.character(x), " "))), collapse=' '))
}
corpus <- tm_map(corpus, stemDocumentfix)
inspect(corpus)
<<VCorpus (documents: 3, metadata (corpus/indexed): 0/0)>>
[[1]]
<<PlainTextDocument (metadata: 7)>>
tast
[[2]]
<<PlainTextDocument (metadata: 7)>>
tast
[[3]]
<<PlainTextDocument (metadata: 7)>>
tast
You might also look at the qdap package, which offers a range of capabilities for text mining.
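
If you have thousands of words rather than three, one way to use stemming is to group the words by their shared root without building a corpus. The following is only a minimal sketch under some assumptions: it calls wordStem() from the SnowballC package directly (as far as I know, this is the stemmer tm relies on), and the `words` vector is a made-up stand-in for the nouns you would read from Excel. Note that this groups inflected forms only; it does not find synonyms or otherwise related words.

```r
library(SnowballC)  # assumption: installed separately; provides wordStem()

# Hypothetical input: replace with the nouns read from your spreadsheet
words <- c("taste", "tastes", "tasting",
           "cabin", "cabins",
           "service", "services")

# Stem every word in a single vectorised call
stems <- wordStem(words, language = "english")

# Group the original words by the stem they share;
# the result is a named list with one element per stem
groups <- split(words, stems)

print(groups)
```

Each element of `groups` can then be mapped to a theme once, instead of mapping every inflected form separately.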

lawyeR
- Thanks for the response. I think you assumed I already have all the words, but I do not. That is exactly what I want: all the words! I have thousands of words for which I have to find relevant words. I know some sort of dictionary would give me that, but it is not one word, there are thousands of them. So how do I do it? – Dharam Feb 13 '15 at 12:52