I scraped data from a website, but for some items I get the error below:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 4: ordinal not in range(128)

I even put "# -*- coding: utf-8 -*-" at the top of the file, but it did not fix the error. Please help.

bgse
michael

1 Answer

Either handle the content as unicode throughout, or strip the non-ASCII characters entirely. The error occurs because you (or a library method you're calling) are encoding unicode content to ASCII without telling the codec to ignore the characters it cannot represent, such as u'\u2019' (a right single quotation mark).

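# Option 1: drop every non-ASCII character (lossy).
content_string = content_string.encode('ascii', 'ignore')

# Option 2 (instead of option 1): keep the content as unicode and encode it
# explicitly as UTF-8 when you output it. This is easier in Python 3, where
# str is unicode by default.
# content_string = content_string.encode('utf-8')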
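To see both options side by side, here is a minimal sketch that should run on Python 2 and Python 3. The `scraped` value and the `items.txt` file name are made up for illustration; they stand in for whatever your scraper returns and wherever you store it.

```python
import io
import os
import tempfile

# Hypothetical scraped value; U+2019 sits at position 4, as in the error.
scraped = u"Here\u2019s the item"

# Strict ASCII encoding is what raises UnicodeEncodeError.
try:
    scraped.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

# Option 1: drop every non-ASCII character (the quote mark is lost).
ascii_only = scraped.encode("ascii", "ignore")

# Option 2: keep unicode in memory and encode as UTF-8 only on output.
utf8_bytes = scraped.encode("utf-8")

# When writing to a file, io.open can do the encoding for you.
path = os.path.join(tempfile.mkdtemp(), "items.txt")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(scraped)
with io.open(path, "r", encoding="utf-8") as handle:
    round_trip = handle.read()
```

Option 2 is usually the better choice for scraped text, since option 1 silently throws characters away.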

The purpose of # -*- coding: utf-8 -*- is to declare the encoding of the source file, so that you can put non-ASCII characters directly into your Python code. It does not set the default encoding used at runtime.

# -*- coding: utf-8 -*-
book_name = 'Les Misérables'
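As a rough illustration of that distinction, the sketch below uses a non-ASCII literal in the source and then checks the interpreter's default encoding separately. The variable names are only for illustration.

```python
# -*- coding: utf-8 -*-
import sys

# The declaration above lets this literal contain a non-ASCII character.
book_name = u'Les Misérables'

# The runtime default is a separate setting that the declaration does not
# change: 'ascii' on Python 2, 'utf-8' on Python 3.
default_encoding = sys.getdefaultencoding()

# Encoding explicitly avoids relying on the default at all.
encoded = book_name.encode('utf-8')
```

So the declaration only affects how the source file is read. Data that arrives at runtime, such as scraped text, still needs an explicit encode or decode.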
hspandher