I have 32-bit Python 3.4.1 installed and am using NLTK 3. All the collections and models are installed. When I enter the following:

>>> import nltk
>>> text = nltk.word_tokenize("this is not working")
>>> text
['this', 'is', 'not', 'working']
>>> nltk.pos_tag(text)

The same happens with tokens read from a local file. In both cases I get the following error when pos_tag tries to load the maxent_treebank_pos_tagger:

Traceback (most recent call last):
  File "<pyshell#72>", line 1, in <module>
    nltk.pos_tag(text)
  File "C:\Python34\lib\site-packages\nltk\tag\__init__.py", line 100, in pos_tag
    tagger = load(_POS_TAGGER)
  File "C:\Python34\lib\site-packages\nltk\data.py", line 779, in load
    resource_val = pickle.load(opened_resource)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xcb in position 0: ordinal not in range(128)

Help!

wayne-m

1 Answer

I traced the error to the encoding in use on my Windows 7 system.

Following the answer at https://stackoverflow.com/a/25590163/1956823 I also tried it on a Mac running OS X 10.10, changed the encoding, and it worked!
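For anyone wondering why an encoding would matter here: the message in the traceback is what Python 3's pickle raises by default when it meets a byte string that was pickled under Python 2 and contains non-ASCII bytes. Below is a minimal sketch of that mechanism only. It assumes the tagger model was pickled under Python 2, it is not NLTK's code, and it is not necessarily the exact change made in the linked answer. The bytes in py2_style_pickle are a hand-built, hypothetical stand-in for the model file.

```python
import pickle

# Hypothetical stand-in for a model file pickled under Python 2:
# a hand-built protocol-2 pickle of the one-byte string '\xcb'.
# (PROTO 2, SHORT_BINSTRING of length 1, BINPUT 0, STOP)
py2_style_pickle = b"\x80\x02U\x01\xcbq\x00."


def try_default_load(data):
    """Return the error message from a default pickle.loads, or None."""
    try:
        pickle.loads(data)  # default is encoding="ASCII"
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# The default load fails on the non-ASCII byte.
default_error = try_default_load(py2_style_pickle)

# Passing an explicit encoding lets the same data load.
as_latin1 = pickle.loads(py2_style_pickle, encoding="latin1")  # decoded to str
as_bytes = pickle.loads(py2_style_pickle, encoding="bytes")    # kept as bytes

print(default_error)
print(repr(as_latin1))
print(repr(as_bytes))
```

Whether "latin1" or "bytes" is the right choice depends on what the pickled object expects its strings to be, so treat this as an illustration of the failure rather than a drop-in patch.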

wayne-m