NLTK Red not recognised as an adjective

Question

The word 'red' is tagged as a verb (VBN). I believe this is because the tagger follows a pattern in which a word with an '-ed' suffix is treated as a verb, or something like that.

How can I make exceptions or otherwise fix this issue? It might occur with other words later.

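import nltk
from nltk import word_tokenize

def LanguageTokenize(read):
    read = word_tokenize(read)  # split the string into word tokens
    read = nltk.pos_tag(read)   # attach a part-of-speech tag to each token
    return read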

>>> LanguageTokenize('the red cat')
[('the', 'DT'), ('red', 'VBN'), ('cat', 'NN')]
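
One way to express "exceptions" is to post-process the tagger's output with a lookup table of words whose tags should be forced. The following is only a minimal sketch of that idea, not a feature of NLTK: the names `TAG_OVERRIDES` and `apply_overrides` are hypothetical, and the input list is copied from the output above so the snippet runs without NLTK installed.

```python
# Hypothetical override table (an assumption, not part of NLTK):
# maps a lowercase word to the tag that should replace the tagger's guess.
TAG_OVERRIDES = {"red": "JJ"}


def apply_overrides(tagged, overrides=TAG_OVERRIDES):
    """Return a new list of (word, tag) pairs with overridden tags applied."""
    return [(word, overrides.get(word.lower(), tag)) for word, tag in tagged]


# Tagger output copied from the example above.
tagged = [('the', 'DT'), ('red', 'VBN'), ('cat', 'NN')]
fixed = apply_overrides(tagged)
print(fixed)  # [('the', 'DT'), ('red', 'JJ'), ('cat', 'NN')]
```

A blanket override like this ignores context, so it will also force 'JJ' in sentences where 'red' is used as a noun. It is best kept to a short list of words that are known to be mis-tagged.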

Welcome to Natural Language Processing, where things never work 100% the way you think they should! I wouldn't spend too much time trying to fix artificial sentences (or phrases) like that. Test the tools on real-world texts and see how they perform there. If you're not happy, you probably need to retrain. If you start defining exceptions for what you think is a corner case, you will never get to an end... — lenz, Sep 14 '15 at 20:27
SpaCy.io seems to be able to recognise it, but I can't use it because I don't have Linux. If it's not too obvious, you're not being much help. — deepadmax, Sep 14 '15 at 22:13
Make sure to have a look at this [question](https://stackoverflow.com/questions/30821188/python-nltk-pos-tag-not-returning-the-correct-part-of-speech-tag) — b3000, Sep 15 '15 at 08:48
@DeePad On StackOverflow, comments are not answers--they do not have to be that much help, and can just be a person's observations. So relax...if we jumped on people for everything, we might jump on them for tagging their posts "cat" and "red", eh? — HostileFork says dont trust SE, Sep 15 '15 at 20:35
Okay. And I couldn't come up with any other tags that did not require me to have a high Reputation Score to create new ones. Sorry... — deepadmax, Sep 16 '15 at 18:52
What @Lenz said: Don't worry about piecemeal "improvements", you have more important things to do and language is endless. That said, the NLTK's POS tagger has about [twice the error rate](http://spacy.io/blog/part-of-speech-POS-tagger-in-python/) of some other free tools. So if you _need_ accuracy it's worth your trouble to look for something else-- and still you'll get errors every other sentence. — alexis, Sep 16 '15 at 22:08
SpaCy.io is much better but errors occur when installing it or installing things I need to install it. Perhaps you might have a link to somewhere for how to set it up properly and make sure I encounter no errors during installation? — deepadmax, Sep 17 '15 at 23:35
@DeePad, you should start a new question for that, providing more details on the errors you get. If it's about installation only, you should probably post it on [Super User](http://superuser.com/) rather than here. — lenz, Sep 18 '15 at 20:07
The `red -> VBN` should disappear once this is complete: https://github.com/nltk/nltk/issues/1122 — alvas, Sep 19 '15 at 10:10
