How can I make the following code asynchronous, so that it decompresses on the fly without first downloading the entire file?

import gzip
import urllib.request

def gunzip_url(url: str):
    with gzip.open(urllib.request.urlopen(url), 'rt') as f:
        return f.read()

The following works, but it does not decompress on the fly (it downloads the entire gz file before decompressing):

import aiohttp
import asyncio
import gzip

async def gunzip_url(client: aiohttp.ClientSession, url: str):
    async with client.get(url) as resp:
        gz = await resp.read()
        return gzip.decompress(gz)

async def main():
    async with aiohttp.ClientSession() as client:
        coros = [gunzip_url(client, 'http://some.file/1.gz'),
                 gunzip_url(client, 'http://some.file/2.gz')]
        return await asyncio.gather(*coros)

data = asyncio.run(main())

The following also works, but it does not decompress on the fly either:

import aiohttp
import gzip
from aioify import aioify

# Wrap the gzip module so that its functions can be awaited
agzip = aioify(obj=gzip, name='agzip')

async def gunzip_url(client: aiohttp.ClientSession, url):
    async with client.get(url) as resp:
        return await agzip.decompress(await resp.read())
Sparkler
  • If the server sends `Content-Encoding: gzip` then you get this for free – fafl Feb 15 '21 at 21:55
  • @fafl, yes, I've seen `ClientSession`'s `auto_decompress: bool = True` but I wasn't lucky with my case. That being said, is it possible to fake it, or somehow force `ClientSession` to assume `Content-Encoding: gzip`? – Sparkler Feb 15 '21 at 22:07
  • You can process the compressed data [chunk by chunk](https://docs.aiohttp.org/en/stable/streams.html) using aiohttp and you can use [zlib](https://stackoverflow.com/a/60855691) to decompress data from those chunks. – Ionut Ticus Feb 19 '21 at 19:46
  • https://github.com/chimpler/async-stream – fuzzyTew Mar 21 '23 at 09:32
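
The chunk-by-chunk suggestion in the comments above could look roughly like the sketch below. It is only a sketch under some assumptions: `gunzip_chunks`, `fake_body` and `demo` are hypothetical names, the 64 KiB chunk size is arbitrary, the file is assumed to be a single-member gzip stream, and `client` is assumed to be an `aiohttp.ClientSession` (aiohttp is not imported, so the demo runs on the standard library alone). The idea is to feed each chunk to a `zlib.decompressobj` created with `wbits=zlib.MAX_WBITS | 16`, which makes zlib expect a gzip header and trailer.

```python
import asyncio
import gzip
import zlib
from typing import AsyncIterable, AsyncIterator


async def gunzip_chunks(chunks: AsyncIterable[bytes]) -> AsyncIterator[bytes]:
    """Decompress a gzip stream incrementally, yielding data as it arrives."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    async for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


async def gunzip_url(client, url: str) -> bytes:
    """Hypothetical helper; `client` is assumed to be an aiohttp.ClientSession."""
    async with client.get(url) as resp:
        # iter_chunked() yields the body piece by piece as it is received.
        body = resp.content.iter_chunked(64 * 1024)
        parts = [part async for part in gunzip_chunks(body)]
    return b"".join(parts)


async def fake_body(payload: bytes, size: int = 7) -> AsyncIterator[bytes]:
    """Stand-in for a network response body, used only by the demo."""
    for i in range(0, len(payload), size):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield payload[i:i + size]


async def demo() -> bytes:
    """Compress some sample data, then decompress it in small chunks."""
    compressed = gzip.compress(b"hello, streaming gzip\n" * 100)
    return b"".join([part async for part in gunzip_chunks(fake_body(compressed))])


if __name__ == "__main__":
    print(len(asyncio.run(demo())))
```

With this shape, `gunzip_url` drops into the `asyncio.gather` call from the question unchanged. It still collects the decompressed output in memory, but decompression overlaps with the download and the compressed file is never buffered whole. A caller that wants true streaming could iterate over `gunzip_chunks(resp.content.iter_chunked(...))` directly instead of joining the parts.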

0 Answers