I am completely new to the field of data mining and to the WEKA tool (I just installed it today).
I need to do topic identification on short sentences of text.
Let's say I have several categories:
- politics
- sports
- other
I am thinking of doing the following: keep a list of terms for each category and compare the text against it:
- Sports:
  - NFL
  - NBA
  - touchdown
  - etc.
- Politics:
  - election
  - president
  - Obama
  - etc.
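To make the idea concrete, here is a rough sketch in plain Java (not WEKA) of the term-matching step I have in mind. The class name `TermMatcher`, the `classify` method, and the short term lists are placeholders I made up for illustration. It splits the text into lowercase words, counts how many words appear in each category's term list, and falls back to "other" when nothing matches.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class TermMatcher {
    // Hypothetical term lists; real lists would be much longer.
    private static final Map<String, Set<String>> TERMS = new LinkedHashMap<>();
    static {
        TERMS.put("sports", Set.of("nfl", "nba", "touchdown"));
        TERMS.put("politics", Set.of("election", "president", "obama"));
    }

    /**
     * Returns the category with the most matching words in the text,
     * or "other" if no word matches. On a tie, the category that was
     * registered first wins.
     */
    static String classify(String text) {
        // Lowercase and split on anything that is not an ASCII letter.
        String[] tokens = text.toLowerCase().split("[^a-z]+");
        String best = "other";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : TERMS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(classify("Obama wins the election")); // politics
        System.out.println(classify("NBA finals tonight"));      // sports
        System.out.println(classify("Nice weather today"));      // other
    }
}
```

This only handles single-word terms and exact matches, which is part of why I would like to use a proper learning algorithm instead.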
Also, I would like to add more categories.
Then I would apply a classification algorithm, such as SVM or Naive Bayes, using WEKA.
Any idea on how to start doing this with WEKA?
I have searched for tutorials on WEKA, but I can't seem to find any examples similar to what I am trying to do.
Any help getting started would be appreciated.