I'm trying to use Python to find some words across web pages (just for practice), but I keep running into a problem. This is my code:
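from urllib import request
from urllib.request import urlopen

url = 'someWikipage'
# Send a browser-like User-Agent header with the request
hdrs = { 'User-Agent': "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11" }
req = request.Request(url, None, hdrs)
response = urlopen(req)
htmlBytes = response.read()
htmlBytes.decode('utf-8')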
It breaks on the last line with this (common) error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2010' in position 18573: character maps to <undefined>
Any ideas about how to prevent or ignore this?
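
For reference, here is a minimal sketch of one way this error can arise and be sidestepped. It assumes the exception is really raised when the decoded text is written to a console that uses a legacy code page such as cp1252 (reported by Python as 'charmap'), not by the decode call itself. The sample string and the helper name to_console_safe are made up for illustration and are not part of the code above.

```python
# Hypothetical sample standing in for the downloaded page bytes.
html_bytes = "co\u2010operate".encode("utf-8")

# Decoding UTF-8 bytes to str succeeds; U+2010 is valid Unicode.
text = html_bytes.decode("utf-8")

# Encoding to cp1252 (what a legacy Windows console may use) fails,
# because cp1252 has no mapping for U+2010.
try:
    text.encode("cp1252")
    failed = False
except UnicodeEncodeError:
    failed = True


def to_console_safe(s, encoding="cp1252"):
    """Replace characters that `encoding` cannot represent with '?'."""
    return s.encode(encoding, errors="replace").decode(encoding)


safe_text = to_console_safe(text)
print(safe_text)
```

If the console is indeed the culprit, another option on Python 3.7+ is calling sys.stdout.reconfigure(errors="replace") once at startup, so that every print substitutes unencodable characters instead of raising.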