I know this looks embarrassingly easy, and I guess the problem is that I just don't have a clear understanding of all this bytes/str/unicode (and, frankly, encoding/decoding) business yet.
I've been trying to get my working code to run on Python 3. The part I'm stuck on is parsing an XML document with lxml and decoding a base64 string contained in that XML.
The code currently works in the following manner: I retrieve the binary data with an XPath query '.../binary/text()'. This produces a one-element list containing an lxml.etree._ElementUnicodeResult object. Then, in Python 2, I was able to do:
decoded = source.decode('base64')
and finally
output = numpy.frombuffer(decoded)
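Put together, the Python 2 version looks roughly like this (the filename is a placeholder, and I'm using the same '//binary' query as further down):

from lxml import etree
import numpy

tree = etree.parse('data.xml')               # placeholder filename
source = tree.xpath('//binary/text()')[0]    # lxml.etree._ElementUnicodeResult
decoded = source.decode('base64')            # str-to-str codec, Python 2 only
output = numpy.frombuffer(decoded)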
However, on Python 3 I get an error message saying
AttributeError: 'lxml.etree._ElementUnicodeResult' object has no attribute 'decode'
This is not so surprising, because lxml.etree._ElementUnicodeResult is a subclass of str.
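A quick check confirms the type relationship (same tree and query as above):

source = tree.xpath('//binary/text()')[0]
print(type(source))             # <class 'lxml.etree._ElementUnicodeResult'>
print(isinstance(source, str))  # True on Python 3, hence no .decode()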
Another way would be to get a real str with the same data in it:
binary = tree.xpath('//binary')[0]
binary_string = binary.text
That would be essentially the same thing. So what do I do to decode it from base64? I've looked at the base64 module, but it takes a bytes object as an argument, and I can't think of a way to present my str as bytes, because if I try to construct a bytes object, Python will try to encode the string, which I don't need.
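This is the kind of failure I mean (base64.decodebytes() replaces the old decodestring(); binary_string is from the snippet above):

import base64
base64.decodebytes(binary_string)
# TypeError: expected bytes-like object, not str
# (exact wording varies by Python version, but it wants bytes)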
Googling further, I came across the binascii module (which is invoked indirectly from base64 anyway, if I'm not mistaken), but calling binascii.b2a_base64() on my string produces
TypeError: 'str' does not support the buffer interface
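That is, with binary_string from above:

import binascii
binascii.b2a_base64(binary_string)   # b2a_* is the encoding direction, for what it's worth
# TypeError: 'str' does not support the buffer interface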
P.S. I've even found an answered question on how to decode a hex string in Python 3, but that is done with a dedicated method, bytes.fromhex(), so I don't see how it would help here.
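For contrast, the hex case only works because the bytes type has its own parser for that particular text format; there is no bytes.frombase64() counterpart:

data = bytes.fromhex('4a6f 650a')   # b'Joe\n' -- hex gets special treatment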
Could someone please tell me what I'm missing? I'm afraid most of the post is irrelevant and only aggravates my shame, but at least you guys know what I tried.