TypeError while doing replace() on encoded BeautifulSoup result in Python

Question

I am trying to encode the text obtained by parsing HTML with the BeautifulSoup library in Python 3, and I get the following error:

----> gmtext.encode('ascii', errors='replace').replace("?", "")

TypeError: a bytes-like object is required, not 'str'

Here is the code implementation:

import urllib.request as urllib2
from bs4 import BeautifulSoup

articleURL = "http://digimon.wikia.com/wiki/Guilmon"

page = urllib2.urlopen(articleURL).read().decode('utf8', 'ignore')
soup = BeautifulSoup(page, 'lxml')
gmtext = soup.find('p').text

gmtext.encode('ascii', errors='replace').replace("?", "")

So far, all the answers I have found for this error are about opening files.

Both solutions seemed to work. Thanks all... – guilemon, Mar 09 '18 at 17:30

Omar Einea · Accepted Answer · 2018-03-09T15:30:12.293

1

You're calling .replace() on the result of .encode(), which returns a bytes object, not a str.
bytes does have a .replace() method, but it only accepts bytes-like arguments, so passing the str values "?" and "" raises the TypeError.

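If you want to, you can do the replacement on the string before encoding, like so:

gmtext.replace("?", "").encode('ascii', errors='replace')

This runs without the TypeError. Note that it removes only the question marks already present in the text; the "?" placeholders that errors='replace' substitutes for non-ASCII characters are added during encoding, so they remain in the result.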
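As a rough illustration of the difference, here is a minimal sketch. The sample string is a hypothetical stand-in for soup.find('p').text (it is not taken from the actual page), and the errors='ignore' variant is an alternative not mentioned in the original post, shown only for comparison.

```python
# Hypothetical sample text standing in for soup.find('p').text;
# it contains one real "?" and two non-ASCII characters.
gmtext = "Guilmon\u2019s name? caf\u00e9"

encoded = gmtext.encode('ascii', errors='replace')
print(type(encoded))  # encode() returns bytes, not str

# Calling bytes.replace() with str arguments reproduces the TypeError.
try:
    encoded.replace("?", "")
except TypeError as exc:
    print("TypeError:", exc)

# Replacing first, then encoding: the literal "?" is removed, but the
# placeholders added by errors='replace' are still in the result.
swapped = gmtext.replace("?", "").encode('ascii', errors='replace')
print(swapped)

# Alternative (an assumption about the intent): if the goal is to drop
# non-ASCII characters entirely, errors='ignore' skips them instead.
ignored = gmtext.encode('ascii', errors='ignore')
print(ignored)
```

Which variant is appropriate depends on whether genuine question marks in the text should survive.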

edited Mar 09 '18 at 15:30

answered Mar 09 '18 at 15:19


score 1 · Answer 2 · answered Mar 09 '18 at 15:39


You can call replace() on the bytes object if you pass bytes arguments (prefix the string literals with b), like this:

gmtext.encode('ascii', errors='replace').replace(b"?", b"")
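A small sketch of this approach, assuming a made-up sample string in place of the real soup.find('p').text. The final decode step is an addition for the case where a str is needed afterwards.

```python
# Hypothetical sample text standing in for soup.find('p').text.
gmtext = "Guilmon\u2019s name? caf\u00e9"

# Non-ASCII characters become b"?" during encoding; replacing with
# bytes arguments then strips every b"?" from the bytes object.
cleaned_bytes = gmtext.encode('ascii', errors='replace').replace(b"?", b"")

# Decode back to str if text (rather than bytes) is needed downstream.
cleaned_text = cleaned_bytes.decode('ascii')
print(cleaned_text)
```

Be aware that this also removes any question marks that were genuinely part of the original text, since they are indistinguishable from the placeholders after encoding.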


Totoro
