Create other forms[noun,adjective,plural,verb..everything] of a word

Question

I am doing review analytics for a cruise company. The whole procedure is too lengthy to describe, so here is a snapshot. I took all the reviews, split them into sentences, and extracted phrases from them, e.g. 'Wonderful cabin', 'Excellent service'. For sentiment analysis I have to map the nouns in those phrases to particular themes. For that mapping I need all the synonyms of each noun and all the other forms of the word (noun, adjective, plural, verb, and so on), so that the final result of my text mining is more efficient. To restate the question: in Excel I have one row of words, say nouns. When I run some code (R, VBA, or anything else), it should give me all the words related to those words. I have already extracted the synonyms with VBA code.

Hm what would be the use case for such a dictionary? Maybe stemming the words to a common root is an alternative. If not, you should perhaps look at a database like [WordNet](http://en.wikipedia.org/wiki/WordNet). The question is probably too broad anyway. — lukeA, Feb 11 '15 at 11:20
I dont know about R bindings but see https://www.nodebox.net/code/index.php/Linguistics - verb.infinitive()/present_participle() - the approach they take is documented — Alex K., Feb 11 '15 at 11:27
@Roland: can an OP accept an answer to a question put on hold? — lawyeR, Feb 11 '15 at 13:28

lawyeR · Answer 1 · 2015-02-11T13:27:17.633

4

You can use the package tm and its stemming capabilities.

If your words are in a character vector such as

library(tm)  # stemDocument() also needs the SnowballC package installed

text <- c("taste", "tastes", "tasting")

you can create a corpus

corpus <- Corpus(VectorSource(text))

and then have the stem function strip the words to their roots. (The helper function avoids some problems.)

stemDocumentfix <- function(x) {
  # Split the document into words, stem each word, and rejoin them into one PlainTextDocument
  PlainTextDocument(paste(stemDocument(unlist(strsplit(as.character(x), " "))), collapse = " "))
}

corpus <- tm_map(corpus, stemDocumentfix)

inspect(corpus)
<<VCorpus (documents: 3, metadata (corpus/indexed): 0/0)>>

[[1]]
<<PlainTextDocument (metadata: 7)>>
tast

[[2]]
<<PlainTextDocument (metadata: 7)>>
tast

[[3]]
<<PlainTextDocument (metadata: 7)>>
tast
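
The same stemming can also be applied directly to a plain character vector, which may be more convenient when the words come from a spreadsheet row. The following is only a minimal sketch, assuming the SnowballC package (the stemmer that tm's stemDocument relies on) is installed; the sample words are hypothetical. Note that it groups words you already have by their common root rather than generating new forms, and that the roots are not necessarily dictionary words.

```r
library(SnowballC)  # provides wordStem(), the Snowball stemmer

# Hypothetical sample of words extracted from review phrases
words <- c("taste", "tastes", "tasting", "cabin", "cabins", "service", "services")

# Reduce every word to its stem
stems <- wordStem(words, language = "english")

# Group the original words by their shared stem
groups <- split(words, stems)
print(groups)
```

Each element of groups is named by a stem and holds the original words that share it, so a theme can be attached once per stem instead of once per word form.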

You might also look at the qdap package, which offers a range of capabilities for text mining.

edited Feb 11 '15 at 13:27

answered Feb 11 '15 at 12:06

lawyeR


1

Did this answer work for you? Would you consider accepting it? – lawyeR Feb 12 '15 at 02:32
Thanks for the response. I think you assumed I already have all the word forms, but I do not. That is exactly what I want: all the forms. I have thousands of words for which I have to find the related words. I know some sort of dictionary would give me that, but it is not one word, there are thousands of them. So how can I do it? – Dharam Feb 13 '15 at 12:52
