I am planning to collect review data from TripAdvisor, and I want to extract hotel-related aspects, assign a polarity to each, and classify them as negative or positive.

What tools can I use for this, and how and where do I start? I know there are tools like GATE, Stanford NLP, OpenNLP, etc., but would they let me perform the specific tasks above? If so, please suggest an approach. I plan to use Java as the programming language and would like to use some APIs.

Also, should I go with a rule-based approach, an ML approach that uses a trained corpus of reviews, or some other approach entirely?

P.S.: I am new to NLP and need some help getting started.

Kripa Jayakumar

2 Answers


Stanford CoreNLP has a lot of features in one package:

  • POS Tagger
  • NER Model
  • Sentiment Analysis
  • Parser

The Apache OpenNLP package consists of:

  • Sentence Detector
  • POS tagger
  • NER
  • Chunker

OpenNLP, however, has no built-in feature for finding sentiment polarity, so you have to pass your tagged tokens to another resource, such as SentiWordNet, to find the polarity.

I have used both OpenNLP and Stanford CoreNLP. For both, though, you need to adapt the sentiment corpus to the target domain (hotel reviews, in your case).
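
To make the lexicon idea concrete, here is a minimal Java sketch of aspect-level polarity lookup. Everything in it is an assumption for illustration: the `AspectPolarity` class, the aspect list, the tiny hand-written lexicon (a stand-in for real SentiWordNet scores), and the two-token window. In a real pipeline the tokens and tags would come from OpenNLP or CoreNLP rather than a regex split.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class AspectPolarity {
    // Hypothetical hotel aspects; a real list would be mined from the reviews.
    static final Set<String> ASPECTS =
            Set.of("room", "staff", "breakfast", "location", "pool");

    // Toy polarity scores standing in for a lexicon such as SentiWordNet.
    static final Map<String, Double> LEXICON = Map.of(
            "clean", 0.6, "friendly", 0.7, "great", 0.8,
            "dirty", -0.7, "rude", -0.8, "cold", -0.4, "noisy", -0.5);

    // Number of tokens on each side of an aspect that may contribute a score.
    static final int WINDOW = 2;

    // Returns aspect -> "positive" / "negative" / "neutral" for one sentence.
    static Map<String, String> classify(String sentence) {
        // Naive tokenization: lowercase and split on anything that is not a letter.
        String[] tokens = sentence.toLowerCase(Locale.ROOT).split("[^a-z]+");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < tokens.length; i++) {
            if (!ASPECTS.contains(tokens[i])) {
                continue;
            }
            // Sum the lexicon scores of the words near the aspect term.
            double score = 0.0;
            int from = Math.max(0, i - WINDOW);
            int to = Math.min(tokens.length - 1, i + WINDOW);
            for (int j = from; j <= to; j++) {
                score += LEXICON.getOrDefault(tokens[j], 0.0);
            }
            String label = score > 0 ? "positive" : score < 0 ? "negative" : "neutral";
            result.put(tokens[i], label);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {staff=positive, room=negative}
        System.out.println(classify("The staff was friendly but the room was dirty."));
    }
}
```

A fixed window is crude: it ignores negation ("not clean") and breaks on long sentences, which is one reason a parser or a domain-trained model tends to do better.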

Bruce
  • Is it really necessary to have a corpus? Are there any other approaches available? Pardon my ignorance; I am new to all these concepts. – Kripa Jayakumar Jan 17 '15 at 18:32
  • You likely need to produce your own corpus, unless you can find something very similar to your particular text domain (hotel reviews). You could do this automatically by storing review-star rating mappings. – Jon Gauthier Jan 18 '15 at 04:31
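
The star-rating idea from the comment above could be sketched in Java roughly as follows. The `StarRatingLabeler` and `Review` names, the thresholds (4 and 5 stars positive, 1 and 2 negative, 3 skipped as ambiguous), and the tab-separated output format are all assumptions made for illustration, not something prescribed by any particular library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class StarRatingLabeler {
    // Hypothetical container for one scraped review and its star rating.
    static final class Review {
        final String text;
        final int stars;

        Review(String text, int stars) {
            this.text = text;
            this.stars = stars;
        }
    }

    // Maps a 1-5 star rating to a label; 3 stars is treated as ambiguous.
    static Optional<String> labelFor(int stars) {
        if (stars >= 4) {
            return Optional.of("positive");
        }
        if (stars <= 2) {
            return Optional.of("negative");
        }
        return Optional.empty();
    }

    // Produces "label<TAB>text" lines, skipping reviews without a clear label.
    static List<String> toTrainingLines(List<Review> reviews) {
        List<String> lines = new ArrayList<>();
        for (Review r : reviews) {
            labelFor(r.stars).ifPresent(label -> lines.add(label + "\t" + r.text));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Review> reviews = List.of(
                new Review("Spotless room and helpful staff.", 5),
                new Review("It was okay.", 3),
                new Review("Noisy and overpriced.", 1));
        toTrainingLines(reviews).forEach(System.out::println);
    }
}
```

Note that labels built this way describe the whole review, not individual aspects, so they suit training a review-level or sentence-level classifier better than an aspect-level one.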

You can try ConceptNet (http://conceptnet5.media.mit.edu/). For instance, the bottom of the API page at https://github.com/commonsense/conceptnet5/wiki/API shows how to "see 20 things in English with the most positive affect".

permanganate