
The code below works perfectly in Python 2.7:

for thepkg in mypkgs.get('package'):
    pkgname = thepkg.get('name').encode('utf-8').replace(' ', '_')
    print('                             <option value="'+pkgname+'">'+pkgname+'</option>')

but in Python 3 it throws this error:

    pkgname = thepkg.get('name').encode('utf-8').replace(' ', '_')
TypeError: a bytes-like object is required, not 'str'

I tried several variations, but either a different error appears or the rendered HTML page displays incorrectly when the thepkg.get('name') value is a non-English string, for example a Japanese or Chinese name. Again, the same code renders the HTML correctly on Python 2.7.

Anoop P Alias
  • The output of `str.encode` on Python 3.x is a `bytes` object, and `bytes.replace` requires its arguments to be `bytes` objects too. See e.g. https://docs.python.org/3/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit. Also look at the various existing questions: https://stackoverflow.com/search?q=%22TypeError%3A+a+bytes-like+object+is+required%2C+not+%27str%27%22. – jonrsharpe Nov 11 '20 at 14:47
  • Does this answer your question? [TypeError: a bytes-like object is required, not 'str' in python and CSV](https://stackoverflow.com/questions/34283178/typeerror-a-bytes-like-object-is-required-not-str-in-python-and-csv) – Sabito stands with Ukraine Nov 11 '20 at 14:58
  • `pkgname = thepkg.get('name').encode('utf-8').decode()` shows that pkgname is now a str object (I removed the replace so as not to confuse things further), but the print() still doesn't produce HTML that displays correctly. – Anoop P Alias Nov 11 '20 at 15:50
  • To add more: the output displays correctly when the script is executed on the command line. It is only the browser that somehow does not interpret it correctly. – Anoop P Alias Nov 11 '20 at 16:12
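
The point made in the first comment can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical package name (`"my package"` is not from the question); it shows the failing call and two ways to keep the types consistent:

```python
name = "my package"  # hypothetical package name, for illustration only

# In Python 3, str.encode() returns bytes, so replace() then expects bytes arguments.
encoded = name.encode("utf-8")
try:
    encoded.replace(" ", "_")  # str arguments on a bytes object
    raised = False
except TypeError:
    raised = True  # "a bytes-like object is required, not 'str'"

# Option 1: stay in bytes and pass bytes arguments.
as_bytes = encoded.replace(b" ", b"_")

# Option 2: stay in str and skip the encode step entirely.
as_str = name.replace(" ", "_")
```

Option 2 matches what the question ultimately needs, since print() in Python 3 expects str rather than bytes.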

1 Answer


The issue was caused by sys.stdout.encoding being set to ANSI_X3.4-1968 (that is, ASCII) for the HTML output. When I reset the encoding to utf-8 using

import io
import sys

sys.stdout = io.TextIOWrapper(sys.stdout.detach(), encoding = 'utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.detach(), encoding = 'utf-8')

before the print() call, the characters in the HTML started displaying correctly.
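
To see why the stream encoding matters, here is a small sketch that stands in for sys.stdout with an in-memory buffer. The `render` helper and the sample text are assumptions made for illustration; only `io.TextIOWrapper` and `io.BytesIO` are standard library calls:

```python
import io

def render(text, encoding):
    """Write text through a TextIOWrapper and return the raw bytes it produced."""
    raw = io.BytesIO()  # stands in for the binary stream behind sys.stdout
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    return raw.getvalue()

sample = "日本語"  # hypothetical non-English package name

# With utf-8 the text is written as its UTF-8 byte sequence.
utf8_bytes = render(sample, "utf-8")

# With ascii (what ANSI_X3.4-1968 refers to) the same text cannot be encoded.
try:
    render(sample, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding='utf-8')` is another way to change the encoding of the existing stream without replacing it.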

I also did not need to use .encode('utf-8') at all, because thepkg.get('name') already returns a Unicode string (str) by default in Python 3.
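
Put together, the loop from the question might look like the following without the encode step. This is only a sketch: the contents of `mypkgs` are invented sample data, and `option_line` is a hypothetical helper, not part of the original script:

```python
def option_line(name):
    """Build one <option> line from a package name, working on str throughout."""
    pkgname = name.replace(" ", "_")
    return '<option value="' + pkgname + '">' + pkgname + '</option>'

# Invented sample data shaped like the structure used in the question.
mypkgs = {"package": [{"name": "my package"}, {"name": "日本語 パッケージ"}]}

lines = [option_line(thepkg.get("name")) for thepkg in mypkgs.get("package")]
for line in lines:
    print(line)
```

Whether the non-ASCII line prints correctly still depends on the encoding of sys.stdout in the environment where the script runs.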

Anoop P Alias