I am writing a Python function that receives a string and decompresses it using zlib.
I am trying to translate the following Go code, which I know works, to Python (please excuse the one-letter variable names; this code was written by someone else):
var b bytes.Buffer
r := bytes.NewReader(s) // s is a []byte
z, err := zlib.NewReader(r)
if err != nil {
    // Error handling
}
_, err = io.Copy(&b, z)
if err != nil {
    // Error handling
}
err = z.Close()
if err != nil {
    // Error handling
}
The data is always received in Python as a string rather than a bytes or byte array object; this is outside my control. (For more context, see below.)
How can I properly encode or convert the string to a bytes object that will be accepted by zlib.decompress?
Do I need to set the wbits parameter to something in particular?
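For reference, here is a minimal sketch of how the wbits values map to stream formats. It assumes the payload is in zlib format (RFC 1950), which is what Go's zlib.NewReader reads; in that case the default wbits should already be the right one. The sample bytes are a hypothetical stand-in for the real data.

```python
import zlib

# Hypothetical sample payload, standing in for the real data.
original = b"hello, zlib"

# zlib.compress produces a zlib-format stream: 2-byte header, deflate data,
# 4-byte Adler-32 trailer.
zlib_stream = zlib.compress(original)

# The default wbits (zlib.MAX_WBITS, i.e. 15) expects a zlib header.
default_result = zlib.decompress(zlib_stream)

# MAX_WBITS | 32 auto-detects either a zlib or a gzip header.
auto_result = zlib.decompress(zlib_stream, wbits=zlib.MAX_WBITS | 32)

# Negative wbits expects a raw deflate stream with no header or trailer,
# so both are stripped here before decompressing.
raw_result = zlib.decompress(zlib_stream[2:-4], wbits=-zlib.MAX_WBITS)
```

If the default wbits raises "incorrect header check", that suggests the bytes handed to zlib.decompress do not start with a zlib header at all, rather than that wbits needs tuning.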
Here is what I tried so far:
uncompressed = zlib.decompress(s.encode())
I am getting this error:
zlib.error: Error -3 while decompressing data: incorrect header check
I also tried
uncompressed = zlib.decompress(bytearray(s, 'utf-8'))
and
uncompressed = zlib.decompress(bytes(s, 'utf-8'))
but both failed with the same error.
Additional context
For those who are interested, here is some further context.
The system I am working on serializes a Go struct and sends the data as a raw array of bytes over the network. To save bandwidth, a portion of the data is compressed before serializing to bytes.
The Go code always gets the data as a []byte because it unmarshals the raw JSON bytes with json.Unmarshal, like this:
env := RedactedStructName{}
err := json.Unmarshal(buf, &env) // buf is a []byte
I did not include this code above because I wanted to keep my question as simple as possible. In Python, the RedactedStructName struct does not exist.
On the other end, the Python program that I am working on needs to deserialize the data and decompress the compressed data so that it can work on it.
The data, when passed through json.loads, produces a Python dictionary. The compressed payload is a value in the dictionary. I don't know why, but json.loads always causes the compressed data to be a Python string rather than a Python bytes or bytearray object.
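One thing worth checking, on the assumption that the compressed field is a []byte in the Go struct and is marshaled with encoding/json: that package encodes []byte values as base64 strings, which would explain why json.loads returns a str. Below is a minimal sketch under that assumption. The names decompress_payload and the "data" key are hypothetical, and the wire message is simulated rather than taken from the real system.

```python
import base64
import json
import zlib


def decompress_payload(s: str) -> bytes:
    """Decode a base64 string to bytes, then zlib-decompress it.

    Assumes the sender base64-encoded a zlib-format stream, as Go's
    encoding/json does for []byte fields.
    """
    compressed = base64.b64decode(s)
    return zlib.decompress(compressed)


# Simulate what the Go side might send (hypothetical message shape).
original = b'{"example": "payload"}'
encoded = base64.b64encode(zlib.compress(original)).decode("ascii")
wire_message = json.dumps({"data": encoded})

# Receiving side: json.loads yields a str for the compressed field.
env = json.loads(wire_message)
uncompressed = decompress_payload(env["data"])
```

If this assumption holds, s.encode() would fail because it yields the ASCII bytes of the base64 text, not the compressed bytes themselves, so zlib never sees a valid header.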