I am trying to encode the text I get after parsing HTML with the BeautifulSoup library in Python 3, and I get the following error:

----> gmtext.encode('ascii', errors='replace').replace("?", "")

TypeError: a bytes-like object is required, not 'str'

Here is the code implementation:

import urllib.request as urllib2
from bs4 import BeautifulSoup

articleURL = "http://digimon.wikia.com/wiki/Guilmon"

page = urllib2.urlopen(articleURL).read().decode('utf8', 'ignore')
soup = BeautifulSoup(page, 'lxml')
gmtext = soup.find('p').text

gmtext.encode('ascii', errors='replace').replace("?", "")

So far, all the answers I have found about this error concern file-opening problems.

guilemon

2 Answers

.encode() returns a bytes object, and while bytes does have a .replace() method,
it only accepts bytes-like arguments. Passing the str values "?" and "" to it raises the TypeError.
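The error can be reproduced without any scraping. The following is a minimal sketch that uses a made-up sample string (not the text from the page in the question) to show what type .encode() returns and what happens when .replace() is given str arguments:

```python
# Hypothetical sample text; any str containing a non-ASCII character will do.
encoded = "caf\u00e9?".encode("ascii", errors="replace")

# .encode() always returns bytes, never str.
print(type(encoded))

message = None
try:
    # bytes.replace() rejects str arguments.
    encoded.replace("?", "")
except TypeError as exc:
    message = str(exc)
    print(message)
```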

If you want to, you can do the replacement before encoding, while gmtext is still a str:

gmtext.replace("?", "").encode('ascii', errors='replace')

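This avoids the error. Note that it only removes question marks already present in the text; the "?" placeholders that errors='replace' substitutes for non-ASCII characters remain in the result.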
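The order of the two calls changes the result. A small sketch, assuming a made-up sample string in place of the scraped paragraph, that compares both orderings:

```python
# Hypothetical sample text standing in for the scraped paragraph.
sample = "Guilmon\u2019s name? Pok\u00e9mon-like"

# Replace first, then encode: the literal "?" is removed, but each non-ASCII
# character is turned into "?" by errors='replace' and stays in the output.
replace_then_encode = sample.replace("?", "").encode("ascii", errors="replace")

# Encode first, then replace with bytes arguments: both the literal "?" and
# the "?" placeholders are removed.
encode_then_replace = sample.encode("ascii", errors="replace").replace(b"?", b"")

print(replace_then_encode)   # b'Guilmon?s name Pok?mon-like'
print(encode_then_replace)   # b'Guilmons name Pokmon-like'
```

Which ordering is appropriate depends on whether the goal is to strip question marks from the original text or to strip the characters that could not be encoded.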

Omar Einea

You can also call .replace() on the bytes result by passing bytes arguments (prefix the string literals with b):

gmtext.encode('ascii', errors='replace').replace(b"?", b"")
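If the aim is only to drop characters that cannot be represented in ASCII, one alternative is errors='ignore', which skips them instead of inserting "?" and therefore leaves genuine question marks alone. This is a sketch of that approach, not something taken from the question; the helper name to_ascii and the sample string are made up for illustration:

```python
def to_ascii(text: str) -> str:
    """Drop every character that cannot be encoded as ASCII (hypothetical helper)."""
    # errors='ignore' skips unencodable characters rather than substituting "?",
    # so question marks that were in the text are preserved.
    # Decoding afterwards turns the bytes back into a str.
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Is Guilmon a Pok\u00e9mon?"
print(to_ascii(sample))  # Is Guilmon a Pokmon?
```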
Totoro