I'm trying to use Pyenchant to spell check each entry in a column called pets in a pandas dataframe called house.

import enchant
dict = enchant.Dict("en_US")

for pets in house:
     [pets] = dict.suggest([pets])[0]

When I run this code, I get an error about not passing bytestrings to Pyenchant. Not sure what to do. Full error text below:

  File "myfile", line 20, in <module>
    [pets] = dict.suggest([pets])[0]
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/enchant/__init__.py", line 662, in suggest
    word = self._StringClass(word)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/enchant/utils.py", line 152, in __new__
    raise Error("Don't pass bytestrings to pyenchant")
enchant.errors.Error: Don't pass bytestrings to pyenchant

How can I fix this? Thanks.

CaptainBear

1 Answer

If your dataframe contains bytestrings, you need to decode them before passing them to enchant; you can do this with .str.decode('utf-8'). To apply the spell checker, the cleanest approach is usually to map a function across the Series rather than iterating over it. (Also, you shouldn't shadow the built-in name dict.)

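import enchant
import pandas as pd

checker = enchant.Dict("en_US")
house = pd.Series([b'caat', b'dogg'])

# decode the bytestrings into regular strings
house = house.str.decode('utf-8')

# map the spell checker over the Series, keeping the first suggestion
house.map(lambda x: checker.suggest(x)[0])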

# Out[19]:
# 0    cat
# 1    dog
# dtype: object
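
If the column holds a mix of values (real bytestrings, ordinary strings, missing entries), a more defensive mapping function may help. The following is only a sketch under stated assumptions: first_suggestion and StubChecker are hypothetical names, and StubChecker merely stands in for enchant.Dict so the example runs without pyenchant installed. It assumes bytes should be decoded as ISO-8859-1 (the encoding mentioned for the CSV) and that anything that is not text should be left untouched. Judging by the traceback, the error is raised while pyenchant converts its argument, so a non-string value may be what triggers it, but that is an assumption.

```python
def first_suggestion(value, checker):
    """Return the checker's top suggestion for value, or value unchanged."""
    # Decode real bytestrings; ISO-8859-1 is an assumption based on how
    # the CSV was reportedly loaded.
    if isinstance(value, bytes):
        value = value.decode("iso-8859-1")
    # Leave anything that is still not text (NaN, None, numbers) and
    # blank strings untouched instead of handing them to the checker.
    if not isinstance(value, str) or not value.strip():
        return value
    # Keep words the dictionary already accepts.
    if checker.check(value):
        return value
    suggestions = checker.suggest(value)
    # suggest() can return an empty list, so guard the [0] lookup.
    return suggestions[0] if suggestions else value


class StubChecker:
    """Hypothetical stand-in for enchant.Dict, used only for this sketch."""

    _known = {"cat", "dog"}
    _fixes = {"caat": ["cat", "cart"], "dogg": ["dog", "dogs"]}

    def check(self, word):
        return word in self._known

    def suggest(self, word):
        return self._fixes.get(word, [])


checker = StubChecker()
pets = [b"caat", "dogg", "cat", float("nan"), "xqzt"]
cleaned = [first_suggestion(p, checker) for p in pets]
print(cleaned)
```

With pyenchant and pandas installed, the same function could be applied as house['pet'].map(lambda v: first_suggestion(v, checker)), with checker = enchant.Dict("en_US") created once outside the lambda.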
maxymoo
  • Unfortunately, I'm getting the same error with the decode line. I had to load house from a CSV using the encoding ISO-8859-1 because it contains unusual characters. I'm not sure the decode is working: house['pet'] = df2['pet'].str.decode('utf-8'). Thoughts? – CaptainBear Sep 15 '16 at 03:20
  • Unfortunately, that's not working either. I'm guessing special characters are stuck in the pet column and pyenchant can't interpret them. – CaptainBear Sep 15 '16 at 04:01
  • Can you find the problematic word and include it in your question? – maxymoo Sep 15 '16 at 05:33