I have a list of words and would like to keep only nouns.
This is not a duplicate of Extracting all Nouns from a text file using nltk
In the linked question a piece of text is processed. The accepted answer proposes a tagger. I'm aware of the different options for tagging text (nlkt, textblob, spacy), but I can't use them, since my data doesn't consist of sentences. I only have a list of individual words:
would
research
part
technologies
size
articles
analyzes
line
nltk
has a wide selection of corpora. I found verbnet
with a comprehensive list of verbs. But so far I didn't see anything similar for nouns. Is there something like a dictionary, where I can look up if a word is a noun, verb, adjective, etc ?
This could probably done by some online service. Microsoft translate for example returns a lot of information in their responses: https://learn.microsoft.com/en-us/azure/cognitive-services/translator/reference/v3-0-dictionary-lookup?tabs=curl But this is a paid service. I would prefer a python package.
Regarding the ambiguity of words: Ideally I would like a dictionary that can tell me all the functions a word can have. "fish" for example is both noun and verb. "eat" is only verb, "dog" is only noun. I'm aware that this is not an exact science. A working solution would simply remove all words that can't be nouns.