I'm trying to compress a dictionary for accessing an API.
I read the code of someone compressing the data with JavaScript and a library called "pako" and tried it myself. It works perfectly:

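var myDictionary = {...}

function compress(a) {
  // gzip: true makes pako emit a gzip container; to: "string" returns a binary string for btoa
  var b = pako.deflate(JSON.stringify(a), {
    to: "string",
    gzip: true
  });
  return btoa(b);
}

var compressed = compress(myDictionary)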

Now I would like to do the same with Python: I tried the following, but the result is different and doesn't work:

my_dictionary = {...}
data_json = json.dumps(my_dictionary, ensure_ascii=False)
data_gzip = zlib.compress(bytes(data_json, "utf-8"))
compressed = base64.b64encode(bytes(str(data_gzip), "utf-8"))

Does anyone have an idea how to solve this problem with Python? Is there a library similar to pako for Python?

Developer
  • What does "doesn't work" mean? You need to be specific on your issue. – roganjosh Sep 21 '17 at 18:56
  • "Result is different" would be expected, and is likely irrelevant. What matters is that result decompresses to the original data. So what do you mean exactly by "and doesn't work"? What happens? – Mark Adler Sep 21 '17 at 20:58
  • The output is not the same as the one JavaScript produces, so the API doesn't accept the output from Python. The output from JavaScript works. – Developer Sep 22 '17 at 16:26

2 Answers

In case someone is still looking for pako-equivalent methods in Python 3 (not tested in Python 2).

pako.deflate() method equivalent in python:

import zlib

def pako_deflate(data):
    # wbits=15 produces a zlib-wrapped stream (2-byte header, Adler-32 trailer)
    compress = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, 15,
        memLevel=8, strategy=zlib.Z_DEFAULT_STRATEGY)
    compressed_data = compress.compress(js_string_to_byte(js_encode_uri_component(data)))
    compressed_data += compress.flush()
    return compressed_data
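
The question's JavaScript passes the `gzip` option, which makes pako emit a gzip container rather than the zlib one produced above. Below is a minimal sketch of that variant, assuming the API expects base64 of the gzip-compressed, compact UTF-8 JSON. The name `pako_gzip_b64` and the sample dictionary are made up for illustration. It also avoids the `str(data_gzip)` call from the question, which base64-encodes the `b'...'` repr text instead of the compressed bytes. The result may not match pako byte for byte, since header fields and compressor details can differ, but it should decompress to the same JSON.

```python
import base64
import json
import zlib


def pako_gzip_b64(obj):
    # JSON.stringify emits no spaces after ',' and ':'; mirror that here.
    text = json.dumps(obj, separators=(",", ":"), ensure_ascii=False)
    # wbits = 16 + 15 asks zlib for a gzip header and trailer.
    compress = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, 16 + 15,
        memLevel=8, strategy=zlib.Z_DEFAULT_STRATEGY)
    gz = compress.compress(text.encode("utf-8")) + compress.flush()
    # b64encode takes the raw bytes; decode to get a str, as btoa returns.
    return base64.b64encode(gz).decode("ascii")


sample = {"name": "caf\u00e9", "values": [1, 2, 3]}  # hypothetical payload
encoded = pako_gzip_b64(sample)

# Round trip: base64-decode, gunzip, parse, compare with the input.
restored = zlib.decompress(base64.b64decode(encoded), 16 + 15)
print(json.loads(restored.decode("utf-8")) == sample)
```

If the receiving API compares against pako's exact bytes rather than decompressing, this sketch would not be enough; that case is not covered here.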

pako.deflateRaw() equivalent in python:

def pako_deflate_raw(data):
    compress = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15, memLevel=8,
        strategy=zlib.Z_DEFAULT_STRATEGY)
    compressed_data = compress.compress(js_string_to_byte(js_encode_uri_component(data)))
    compressed_data += compress.flush()
    return compressed_data

pako.inflate() method equivalent:

def pako_inflate(data):
    decompress = zlib.decompressobj(15)
    decompressed_data = decompress.decompress(data)
    decompressed_data += decompress.flush()
    return decompressed_data

pako.inflateRaw() method equivalent:

def pako_inflate_raw(data):
    decompress = zlib.decompressobj(-15)
    decompressed_data = decompress.decompress(data)
    decompressed_data += decompress.flush()
    return decompressed_data
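
The four functions above differ only in the `wbits` argument, which selects the container around the deflate stream. Here is a small sketch of how that plays out in Python's `zlib`, using a made-up payload and a hypothetical helper `deflate`: 15 gives a zlib container, -15 gives a raw stream, 16 + 15 gives gzip, and 32 + 15 on the decompression side auto-detects zlib or gzip.

```python
import zlib

payload = b"hello pako " * 20  # made-up sample data


def deflate(data, wbits):
    # Hypothetical helper: compress with the given container selection.
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, wbits)
    return c.compress(data) + c.flush()


zlib_stream = deflate(payload, 15)       # zlib container (pako.deflate)
raw_stream = deflate(payload, -15)       # no container (pako.deflateRaw)
gzip_stream = deflate(payload, 16 + 15)  # gzip container (pako.gzip)

# The containers add header and trailer bytes around the deflate data.
print(len(raw_stream), len(zlib_stream), len(gzip_stream))

# wbits = 32 + 15 lets zlib detect a zlib or gzip header automatically.
for stream in (zlib_stream, gzip_stream):
    print(zlib.decompress(stream, 32 + 15) == payload)

# Raw streams have no header to detect, so -15 must be given explicitly.
print(zlib.decompress(raw_stream, -15) == payload)
```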

The utility functions used in the functions above are:

from urllib.parse import quote, unquote
import base64

def js_encode_uri_component(data):
    return quote(data, safe='~()*!.\'')


def js_decode_uri_component(data):
    return unquote(data)


def js_string_to_byte(data):
    return bytes(data, 'iso-8859-1')


def js_bytes_to_string(data):
    return data.decode('iso-8859-1')


def js_btoa(data):
    return base64.b64encode(data)


def js_atob(data):
    return base64.b64decode(data)
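
To see why these helpers line up with their JavaScript counterparts, here is a short check with made-up inputs. `encodeURIComponent` leaves `A-Z a-z 0-9 - _ . ! ~ * ' ( )` unescaped, which is what the `safe` argument reproduces. ISO-8859-1 maps every byte value 0-255 to the code point of the same number, so it round-trips binary strings without loss. One difference to keep in mind: `base64.b64encode` returns `bytes`, whereas `btoa` returns a string.

```python
import base64
from urllib.parse import quote, unquote

# Same safe set as the js_encode_uri_component helper.
encoded = quote("a b&c=d/é!", safe="~()*!.'")
print(encoded)  # a%20b%26c%3Dd%2F%C3%A9!
print(unquote(encoded))

# Every byte value survives a Latin-1 decode/encode round trip.
all_bytes = bytes(range(256))
as_js_string = all_bytes.decode("iso-8859-1")
print(as_js_string.encode("iso-8859-1") == all_bytes)

# b64encode yields bytes; decode to ASCII for a str like btoa's result.
b64 = base64.b64encode(b"pako")
print(type(b64).__name__, b64.decode("ascii"))
```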
ydrall
  • Excellent answer, but for the deflate I had to remove the conversion to get the data in and out cleanly. Hope this helps someone: compressed_data = compress.compress(data) – user7660047 Apr 30 '22 at 09:43
0
from urllib import quote, unquote
import base64
import zlib


def js_encode_uri_component(data):
    return quote(data)

def js_string_to_byte(data):
    return bytes(data)


def js_bytes_to_string(data):
    return data.decode('iso-8859-1')


def js_btoa(data):
    return base64.b64encode(data)


def js_atob(data):
    return base64.b64decode(data)


def pako_inflate_raw(data):
    decompress = zlib.decompressobj(-15)
    decompressed_data = decompress.decompress(data)
    decompressed_data += decompress.flush()
    return decompressed_data



def pako_deflate_raw(data):
    compress = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15, 8,
        zlib.Z_DEFAULT_STRATEGY)
    compressed_data = compress.compress(js_string_to_byte(js_encode_uri_component(data)))
    compressed_data += compress.flush()
    return compressed_data

data = "0b cb 49 cc 4b 67 30 64 28 c8 2f 4f 2d 2a ce 48 cd c9 61 70 d3 4b ce 4f 49 d5 2b c9 cc 67 30 34 35 60 48 4d ce c8 57 50 37 a8 48 31 30 a8 b0 1e 08 00 76 80 52 5a 9a a5 a1 a1 a1 25 18 28 55 a8 bb e9 65 e6 15 94 96 80 5d 69 c0 10 96 58 94 5e 0c a4 83 00".replace(" ","").decode("hex")

data_main = """Vlang\x001\x00powershell\x00F.code.tio\x0038\x00echo \'0xd00\xe2\x80\x83;echo "ff9111999999"\xe2\x80\x83\'F.input.tio\x000\x00Vargs\x000\x00R"""

print repr(pako_inflate_raw(data))

print repr(pako_deflate_raw(data_main))
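
The snippet above targets Python 2 (`from urllib import quote`, `str.decode("hex")`, and the `print` statement). Here is a rough Python 3 sketch of the same hex-dump round trip, using a short sample payload rather than the data above and skipping the URI-encoding step. `bytes.fromhex` stands in for `.decode("hex")` and tolerates the spaces between byte pairs.

```python
import zlib


def pako_inflate_raw(data):
    decompress = zlib.decompressobj(-15)
    return decompress.decompress(data) + decompress.flush()


def pako_deflate_raw(data):
    compress = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15, 8,
        zlib.Z_DEFAULT_STRATEGY)
    return compress.compress(data) + compress.flush()


# Short sample payload; in Python 3 the input must be bytes, not str.
original = b"Vlang\x001\x00powershell\x00"

# Build a space-separated hex dump like the one in the answer.
hex_dump = " ".join(f"{byte:02x}" for byte in pako_deflate_raw(original))
print(hex_dump)

# bytes.fromhex skips the spaces, so no .replace(" ", "") is needed.
restored = pako_inflate_raw(bytes.fromhex(hex_dump))
print(repr(restored))
```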