I'm looking to train a naive Bayes classifier on some new data sources that haven't been used before. I've already looked at the Pang & Lee corpus of IMDb reviews and the MPQA opinion corpus. I'm looking for new web services that fit the following criteria.

  1. Easily classified: each item must have a like/dislike flag or a 5-star rating
  2. Readily available
  3. Pertains to new material (less important than the first two)
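
To make the first criterion concrete, here is a minimal sketch of how star ratings could become training labels for a naive Bayes classifier. The helper names (`stars_to_label`, `train_naive_bayes`, `classify`) and the sample reviews are hypothetical, and the sketch assumes whitespace tokenization, add-one smoothing, and that the middle rating is dropped as ambiguous.

```python
import math
from collections import Counter, defaultdict


def stars_to_label(stars, max_stars=5):
    """Map a star rating to 'pos' or 'neg'; return None for the middle rating."""
    midpoint = (max_stars + 1) / 2
    if stars > midpoint:
        return "pos"
    if stars < midpoint:
        return "neg"
    return None  # assumption: a middle rating is too ambiguous to train on


def train_naive_bayes(labeled_docs):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability (add-one smoothing)."""
    vocab = {word for counts in word_counts.values() for word in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical (text, stars) pairs standing in for data pulled from a review API.
reviews = [
    ("great food and friendly staff", 5),
    ("loved it great value", 4),
    ("it was fine", 3),
    ("terrible service and cold food", 1),
    ("awful slow and rude staff", 2),
]
labeled = [(text, stars_to_label(stars)) for text, stars in reviews]
labeled = [(text, label) for text, label in labeled if label is not None]

model = train_naive_bayes(labeled)
prediction = classify("great friendly staff", *model)
```

A like/dislike source skips the mapping step entirely, since the flag is already a binary label.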

Here are some candidates I have come up with on my own.

  • Etsy API
  • Rotten Tomatoes API
  • Yelp API

Any other suggestions would be much appreciated =)

Joey C.
Greg Guida
  • possible duplicate of [Training data for sentiment analysis](http://stackoverflow.com/questions/7551262/training-data-for-sentiment-analysis) – Fred Foo Feb 16 '12 at 10:03

2 Answers

Take a look at Sentiment140. It has a corpus of labeled tweets that you can download and train on, and you can easily extend it with new tweets.
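
As a rough sketch of turning the download into training pairs: the snippet below assumes the CSV has six columns (polarity, id, date, query, user, text) with no header row, and that polarity is `0` for negative and `4` for positive. Check this against the corpus documentation before relying on it. The function name `load_sentiment140` and the sample rows are made up for illustration.

```python
import csv
import io

# Assumption: polarity codes used in the training file; other values are skipped.
POLARITY_TO_LABEL = {"0": "neg", "4": "pos"}


def load_sentiment140(file_obj):
    """Yield (tweet_text, label) pairs from a Sentiment140-style CSV."""
    for row in csv.reader(file_obj):
        if len(row) != 6:
            continue  # skip rows that do not match the assumed layout
        polarity, _tweet_id, _date, _query, _user, text = row
        label = POLARITY_TO_LABEL.get(polarity)
        if label is not None:
            yield text, label


# Made-up rows in the assumed layout, used in place of the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 2009","NO_QUERY","user_a","my flight got cancelled again"\n'
    '"4","2","Mon Apr 06 2009","NO_QUERY","user_b","loving the new album"\n'
    '"2","3","Mon Apr 06 2009","NO_QUERY","user_c","heading to the airport"\n'
)
pairs = list(load_sentiment140(sample))
```

For the real file, pass an open file handle instead of `sample`; it may need an explicit encoding such as `latin-1`. The resulting pairs can be fed to whatever naive Bayes trainer you already use.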

Shane

Pang & Lee's later survey, "Opinion Mining and Sentiment Analysis" (2008), has a section on publicly available resources, with links to those corpora.

NLPer