etree.tostring() without additional arguments outputs ASCII-only data as a bytes object, escaping any non-ASCII characters as numeric character references. You could use etree.tounicode() instead:
>>> from lxml import etree
>>> root = etree.Element('пример')
>>> print(etree.tostring(root))
b'<&#1087;&#1088;&#1080;&#1084;&#1077;&#1088;/>'
>>> print(etree.tounicode(root))
<пример/>
or specify a codec with the encoding argument; you'd still get bytes, however, so the output would need to be decoded again:
>>> print(etree.tostring(root, encoding='utf8'))
b'<\xd0\xbf\xd1\x80\xd0\xb8\xd0\xbc\xd0\xb5\xd1\x80/>'
>>> print(etree.tostring(root, encoding='utf8').decode('utf8'))
<пример/>
Setting the encoding to 'unicode' gives you the same output that tounicode() produces, and is the preferred spelling:
>>> print(etree.tostring(root, encoding='unicode'))
<пример/>
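For comparison, here is a minimal sketch that puts the three options side by side in a script rather than an interactive session. It assumes Python 3 with lxml installed; the variable names (as_ascii, as_utf8, as_text) are only illustrative and do not come from lxml.

```python
from lxml import etree

root = etree.Element('пример')

# Default call: a bytes object containing ASCII-only data.
as_ascii = etree.tostring(root)

# Explicit codec: still bytes, now UTF-8 encoded.
as_utf8 = etree.tostring(root, encoding='utf-8')

# encoding='unicode': a str, no decoding step needed.
as_text = etree.tostring(root, encoding='unicode')

print(type(as_ascii).__name__, type(as_utf8).__name__, type(as_text).__name__)
# Decoding the UTF-8 bytes yields the same text as encoding='unicode'.
print(as_utf8.decode('utf-8') == as_text)
```

If the goal is a string to print or concatenate, encoding='unicode' avoids the encode/decode round trip; the bytes variants are more useful when writing to a file opened in binary mode.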