I've been stuck on this for way too long. I am trying to decode the bytes objects I get back from a URL request. When I decode each line as UTF-8 and print it, nothing appears where the decoded string should be. What am I missing here?
import urllib.request

url = 'https://www2.census.gov/geo/docs/reference/codes/files/national_cousub.txt'
data = urllib.request.urlopen(url)
counter = 0
for line in data:
    print('byte string:')
    print(line)
    print('after decoding:')
    print(line.decode('utf-8'))
    counter = counter + 1
    if counter > 5:
        break
This is what I see on the console:
byte string:
b'STATE,STATEFP,COUNTYFP,COUNTYNAME,COUSUBFP,COUSUBNAME,FUNCSTAT\r\r\n'
after decoding:
byte string:
b'AL,01,001,Autauga County,90171,Autaugaville CCD,S\r\r\n'
after decoding:
byte string:
b'AL,01,001,Autauga County,90315,Billingsley CCD,S\r\r\n'
after decoding:
byte string:
b'AL,01,001,Autauga County,92106,Marbury CCD,S\r\r\n'
after decoding:
byte string:
b'AL,01,001,Autauga County,92628,Prattville CCD,S\r\r\n'
after decoding:
byte string:
b'AL,01,003,Baldwin County,90207,Bay Minette CCD,S\r\r\n'
after decoding:
I am on Windows 10 with Python 3.5.5, installed via Anaconda. I am running this in PyCharm.
sys.stdout.encoding is 'UTF-8'.
I get the same result with print(line.decode('utf-8'), file=sys.stderr).
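
One thing that might be worth checking, as an assumption rather than a confirmed cause: every line in the raw output above ends with \r\r\n, and some consoles treat a carriage return as "go back to the start of the line", which can blank out text already written there. The sketch below uses a sample line copied from the output above to inspect the decoded string with repr() and to print it with the trailing carriage returns stripped. The names raw, decoded and cleaned are purely illustrative.

```python
# Sample line copied from the console output above; note the trailing \r\r\n.
raw = b'AL,01,001,Autauga County,90171,Autaugaville CCD,S\r\r\n'

decoded = raw.decode('utf-8')

# repr() shows control characters as escape sequences instead of
# sending them to the console, so the decoded text stays visible.
print(repr(decoded))

# Strip trailing carriage returns and newlines before printing.
cleaned = decoded.rstrip('\r\n')
print(cleaned)
```

If the stripped version shows up in the console while the unstripped one does not, the decoding itself is probably fine and the trailing \r characters are what hide the text.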