Python (requests) encoding trouble (UTF-8 - CP1251)

Question

I am trying to request a URL like http://example.com/?param=%DD%CC%C0-15 with the Python requests library, like this:

group = "ЭМА-15".encode('cp1251')
r = requests.get('http://example.com/?param=' + group)
r.encoding = "cp1251"

(because the site uses the windows-1251 (cp1251) encoding)

I get an error at line 2: UnicodeDecodeError: 'utf8' codec can't decode byte 0xdd in position 82: invalid continuation byte. But this sequence of bytes (0xDD (%DD)...) is exactly what I need. How can I fix that?

Please have a look at my answer. — Ruhul Amin, Dec 08 '16 at 22:38

score 1 · Answer 1 · answered Dec 08 '16 at 22:30

I guess you are trying to display cp1251 characters, but your editor is configured to use UTF-8. The coding: cp1251 declaration is only used by the Python interpreter to decode characters outside the ASCII range in Python source files. Try:

group = "ЭМА-15".decode('utf8').encode('cp1251')
r = requests.get('http://example.com/?param=' + group)
r.encoding = "cp1251"

When I run this in my terminal, I get:

>>> "ЭМА-15".decode('utf8').encode('cp1251')
'\xdd\xcc\xc0-15'
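
For readers on Python 3, here is a minimal sketch of the same encoding step, and of why these bytes cannot be read back as UTF-8. It assumes Python 3 and uses only the standard library; the variable names are illustrative and not from the original post.

```python
# Python 3 sketch: str is already Unicode, so no .decode('utf8') step is needed.
group = "ЭМА-15".encode("cp1251")
print(group)  # b'\xdd\xcc\xc0-15'

# The same bytes are not valid UTF-8: 0xDD announces a two-byte sequence,
# but the next byte, 0xCC, is not a continuation byte, so decoding fails.
decode_error = None
try:
    group.decode("utf-8")
except UnicodeDecodeError as exc:
    decode_error = exc
    print(exc.reason)
```

This suggests the bytes themselves are fine; the failure appears only when something tries to interpret them as UTF-8.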

No, it prints successfully, the problem is in the request function — Vlad Markushin, Dec 08 '16 at 22:38

score 1 · Accepted Answer · answered Dec 08 '16 at 22:36

There are two things:

1. The Python interpreter needs to know the encoding of the "ЭМА-15" string in the source file.
2. Query parameters are usually handled by requests, but since you are constructing the URL manually, it is best to quote the value yourself.

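# -*- coding: utf-8 -*-
# Python 2 code: in Python 3, quote_plus lives in urllib.parse.
import urllib
import requests

# Encode the Unicode literal to cp1251 bytes, then percent-encode them.
group = u"ЭМА-15".encode('cp1251')
param = urllib.quote_plus(group)
print(param)
r = requests.get('http://example.com/?param=' + param)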

Output

%DD%CC%C0-15
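
On Python 3 the same idea can be sketched with urllib.parse.quote_plus, which accepts an encoding argument. The snippet below is a sketch under that assumption (Python 3, standard library only). The request itself is left commented out so that it runs offline, and the names url, decoded and query are illustrative.

```python
from urllib.parse import quote_plus, unquote_to_bytes, urlencode

# Percent-encode the text using cp1251 instead of the default UTF-8.
param = quote_plus("ЭМА-15", encoding="cp1251")
print(param)  # %DD%CC%C0-15

url = "http://example.com/?param=" + param
# import requests
# r = requests.get(url)
# r.encoding = "cp1251"  # decode the response body as cp1251

# Round trip: the percent-encoded bytes decode back to the original text.
decoded = unquote_to_bytes(param).decode("cp1251")

# urlencode percent-encodes bytes values as-is, so pre-encoded cp1251 bytes
# produce the same query string.
query = urlencode({"param": "ЭМА-15".encode("cp1251")})
print(query)  # param=%DD%CC%C0-15
```

Quoting the value before it reaches requests keeps the URL pure ASCII, so no implicit UTF-8 decoding of the raw cp1251 bytes is needed.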

I was waiting for such an answer. Thanks a lot. — Vlad Markushin, Dec 08 '16 at 22:39
