I apologize for the newbie nature of this question. I have been trying to figure out Python packaging and namespaces, but the finer points seem to elude me. To wit, I would like to use the Python wrapper for the Stanford part-of-speech tagger. I had no trouble finding the documentation here, which provides a usage sample:
st = StanfordTagger('bidirectional-distsim-wsj-0-18.tagger')
st.tag('What is the airspeed of an unladen swallow ?'.split())
[('What', 'WP'), ('is', 'VBZ'), ('the', 'DT'), ('airspeed', 'NN'), ('of', 'IN'), ('an', 'DT'), ('unladen', 'JJ'), ('swallow', 'VB'), ('?', '.')]
This looks great, but I can't seem to get the right namespaces to show up in my local Python + NLTK installation (I have the latest NLTK version, and have tried the following in Python 2.6.x as well as 2.7.x):
>>> import nltk
>>> from nltk import *
>>> from nltk.tag import stanford
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name stanford
I also tried this import statement, with the same result:
>>> from nltk.tag.stanford import StanfordTagger
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named stanford
Searching around here on SO, I found this question, where the poster seems to be experiencing the exact same problem, but is able to get past the namespace step with:
The problem is that my nltk lib doesnt contain the stanford module. So I copied the same into the appropriate folder and compiled the same.
It sounds like it is indeed the same issue, except I can't for the life of me find any documentation on how to add modules to NLTK. Everything I read on the NLTK web site implies that the Stanford module should already be packaged into the base install. So, a question in two parts:
- (Specific) Any suggestions for getting past this particular issue and starting to use StanfordTagger from Python? I know I can easily call the jar directly and then interpret the output in Python (that's all the Python wrapper does anyway), but I would like to get this to work on principle, if nothing else.
- (General) What is a good pythonic approach to investigating missing packaging issues or dependencies such as above?
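To make the general part concrete, here is a minimal sketch of the kind of check I have in mind, using only the standard library. The helper names describe_package and has_submodule are hypothetical (they are not part of NLTK), and I am assuming that a module missing from the listing simply means the installed copy of the package does not ship it:

```python
import importlib
import pkgutil


def describe_package(name):
    """Return (version, location, submodule names) for an importable package.

    version and location are None when the package does not define them.
    """
    pkg = importlib.import_module(name)
    version = getattr(pkg, "__version__", None)
    location = getattr(pkg, "__file__", None)
    search_path = getattr(pkg, "__path__", None)
    if search_path is None:
        # A plain module (not a package) has no submodules to list.
        return version, location, []
    # iter_modules yields (finder, name, ispkg) for each direct submodule.
    submodules = sorted(name for _, name, _ in pkgutil.iter_modules(search_path))
    return version, location, submodules


def has_submodule(package, submodule):
    """True if `submodule` is directly inside the installed `package`."""
    return submodule in describe_package(package)[2]


if __name__ == "__main__":
    try:
        # Which copy of nltk.tag is being imported, and what does it contain?
        version, location, submodules = describe_package("nltk.tag")
        print("nltk.tag loaded from:", location)
        print("submodules:", submodules)
        print("has stanford:", "stanford" in submodules)
    except ImportError as exc:
        print("could not import nltk.tag:", exc)
```

If "stanford" is absent from that listing, the ImportError above would be about the installed files rather than about my import syntax, and the printed location would show which copy of NLTK Python is actually picking up.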