A certain Python API returns u'J\xe4rvenp\xe4\xe4'
for the Finnish word Järvenpää,
where \xe4 == ä.
I then call email.header to add this field to a header to be printed.
email.header
falls over when it tries to decode the umlaut:
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/email/header.py", line 73, in decode_header
header = str(header)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 1: ordinal not in range(128)
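As far as I can tell, the failing str(header) call on Python 2 amounts to an implicit ASCII encode of the unicode object. A minimal sketch that reproduces the same error class outside of email.header (the variable names are mine, not from header.py):

```python
# Sketch: reproduce the implicit ASCII encode that str() performs on a
# unicode object in Python 2. Variable names here are illustrative only.
name = u'J\xe4rvenp\xe4\xe4'

try:
    name.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc  # same exception class as in the traceback above

# The offending character is the first umlaut, at index 1.
print(repr(error))
```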
I've tried a few things:
- Adding # -*- coding: utf-8 -*- to the top of header.py
- Calling unicode() on the Finnish string before passing it to email.header
- Calling .encode('utf-8') on the Finnish string before passing it to email.header
None of these solved the problem. What am I doing wrong? I'd imagine that a solution won't involve modifying header.py
(a core Python module).
Python version: 2.7.10
UPDATE:
Header() is not being instantiated directly. Rather, I'm calling the decode_header() function on the string:
email.Header.decode_header(theString)
It now seems that simply extending the call like this:
email.Header.decode_header(theString.encode('utf-8'))
solves the problem.
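For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the working call. The version check is my own assumption: the explicit encode is only needed on Python 2, where decode_header() calls str() on its argument, while Python 3 accepts text directly (and would reject bytes here).

```python
import sys
from email.header import decode_header

name = u'J\xe4rvenp\xe4\xe4'  # the value returned by the API

# Assumption: only Python 2 needs the explicit encode to UTF-8 bytes.
if sys.version_info[0] == 2:
    value = name.encode('utf-8')
else:
    value = name

# The string contains no RFC 2047 encoded-words, so decode_header()
# returns it as a single chunk with no charset.
parts = decode_header(value)
print(parts)
```

Note that decode_header() only parses a header value; it does not encode one for output.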