I am using `nltk.pos_tag`, and sometimes I get some weird tags.

For example:

[(u'father', 'NN'), (u'always', 'RB'), (u'interested', 'JJ'), (u'magic', 'NN'),
 (u'carnival', 'NN'), (u'tricks', 'NNS'), (u',', ','), (u'wanting', 'VBG'),
 (u'see', 'NN'), (u'worked', 'VBD'), (u'.', '.'), (u'one', 'CD'),
 (u'things', 'NNS'), (u'knew', 'VBD')]

Tags such as `(u'interested', 'JJ')` and `(u'see', 'NN')` are making my analysis faulty. Can anyone suggest another way of doing POS tagging?

Vincent Savard
SKY
  • Please post the example sentence. – Riyaz Feb 25 '16 at 15:50
  • `My father was always interested in magic and carnival tricks, and wanting to see how they worked. One of the things he knew ...`. That was the sentence. In the question I had removed the stopwords before POS tagging. – SKY Feb 25 '16 at 15:57
  • Oh, you are not supposed to remove anything from the sentence before POS tagging. Remove the stopwords afterwards. – Riyaz Feb 25 '16 at 15:59
  • Okay, in that case the output is: `[('My', 'PRP$'), ('father', 'NN'), ('was', 'VBD'), ('always', 'RB'), ('interested', 'JJ'), ('in', 'IN'), ('magic', 'JJ'), ('and', 'CC'), ('carnival', 'NN'), ('tricks', 'NNS'), (',', ','), ('and', 'CC'), ('wanting', 'VBG'), ('to', 'TO'), ('see', 'VB'), ('how', 'WRB'), ('they', 'PRP'), ('worked', 'VBD'), ('.', '.'), ('One', 'CD'), ('of', 'IN'), ('the', 'DT'), ('things', 'NNS'), ('he', 'PRP'), ('knew', 'VBD')]`. But what about `('interested', 'JJ')`? – SKY Feb 25 '16 at 16:02
  • Use a CRF tagger instead of an HMM or perceptron tagger. – Riyaz Feb 25 '16 at 16:06
  • Possible duplicate of [Python NLTK pos_tag not returning the correct part-of-speech tag](http://stackoverflow.com/questions/30821188/python-nltk-pos-tag-not-returning-the-correct-part-of-speech-tag) – alexis Feb 25 '16 at 21:00
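
To illustrate the advice in the comments (tag the full sentence first, drop stopwords afterwards), here is a minimal sketch. It starts from the tagged output SKY posted above, so it runs without NLTK installed. The `STOPWORDS` set and the `drop_stopwords` helper are hypothetical names made up for this example; with NLTK available, one would probably build `tagged` with `nltk.pos_tag(nltk.word_tokenize(text))` and take the stopwords from `nltk.corpus.stopwords.words('english')`.

```python
# Tagged output for the full sentence, copied from the comment thread above.
tagged = [
    ('My', 'PRP$'), ('father', 'NN'), ('was', 'VBD'), ('always', 'RB'),
    ('interested', 'JJ'), ('in', 'IN'), ('magic', 'JJ'), ('and', 'CC'),
    ('carnival', 'NN'), ('tricks', 'NNS'), (',', ','), ('and', 'CC'),
    ('wanting', 'VBG'), ('to', 'TO'), ('see', 'VB'), ('how', 'WRB'),
    ('they', 'PRP'), ('worked', 'VBD'), ('.', '.'), ('One', 'CD'),
    ('of', 'IN'), ('the', 'DT'), ('things', 'NNS'), ('he', 'PRP'),
    ('knew', 'VBD'),
]

# Illustrative stopword list (an assumption, not NLTK's actual list).
STOPWORDS = {'my', 'was', 'in', 'and', 'to', 'how', 'they', 'of', 'the', 'he'}


def drop_stopwords(tagged_tokens, stopwords=STOPWORDS):
    """Remove stopwords AFTER tagging, so the tagger saw the full context."""
    return [
        (word, tag)
        for word, tag in tagged_tokens
        if word.lower() not in stopwords
    ]


filtered = drop_stopwords(tagged)
print(filtered)
```

Because the tagger saw `to see` in context, `see` keeps its `VB` tag after filtering instead of the `NN` it received when the stopwords were stripped first.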
