13

This question was based on a typo, a simple mistake on my part.

Not deleting since it has sample code for httpx.

I'm attempting to leverage asyncio to parallelize several long-ish running web requests. Because I'm migrating from the requests library, I would like to use the httpx library, due to the similar API. My environment is a Python 3.7.7 Anaconda distribution with all required packages installed (Windows 10).

However, despite being able to use httpx for synchronous web requests (or for serially executing async requests that run one after another), I have not been able to succeed at running more than one async request at a time, despite easily doing so with the aiohttp library.

Here's example code that runs cleanly with aiohttp. (Note that I'm running in Jupyter, so I already have an event loop; hence the lack of asyncio.run().)

import aiohttp
import asyncio
import time
import httpx

async def call_url(session):
    url = "https://services.cancerimagingarchive.net/services/v3/TCIA/query/getCollectionValues"        
    response = await session.request(method='GET', url=url)
    #response.raise_for_status() 
    return response

for i in range(1,5):
    start = time.time() # start time for timing event
    async with aiohttp.ClientSession() as session: #use aiohttp
    #async with httpx.AsyncClient as session:  #use httpx
        await asyncio.gather(*[call_url(session) for x in range(i)])
    print(f'{i} call(s) in {time.time() - start} seconds')

This results in an expected response time profile:

1 call(s) in 7.9129478931427 seconds
2 call(s) in 8.876991510391235 seconds
3 call(s) in 9.730034589767456 seconds
4 call(s) in 10.630006313323975 seconds

However, if I uncomment async with httpx.AsyncClient as session: #use httpx and comment out async with aiohttp.ClientSession() as session: #use aiohttp (to swap in httpx for aiohttp) then I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-108-25244245165a> in async-def-wrapper()
     17         await asyncio.gather(*[call_url(session) for x in range(i)])
     18     print(f'{i} call(s) in {time.time() - start} seconds')

AttributeError: __aexit__

In my research online, I could only find this one Medium article by Simon Hawe showing how to use httpx for parallel requests. See https://medium.com/swlh/how-to-boost-your-python-apps-using-httpx-and-asynchronous-calls-9cfe6f63d6ad

However, the example async code doesn't even use an async session object, so I was suspicious from the start. The code does not execute in either a Python 3.7.7 environment or in Jupyter. (Code is here: https://gist.githubusercontent.com/Shawe82/a218066975f4b325e026337806f8c781/raw/3cb492e971c13e76a07d1a1e77b48de94aa7229c/concurrent_download.py)

It results in this error:

Traceback (most recent call last):
  File ".\async_http_test.py", line 24, in <module>
    asyncio.run(download_all_photos('100_photos'))
  File "C:\Users\stborg\AppData\Local\Continuum\anaconda3\envs\fastai2\lib\asyncio\runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "C:\Users\stborg\AppData\Local\Continuum\anaconda3\envs\fastai2\lib\asyncio\base_events.py", line 587, in run_until_complete
    return future.result()
  File ".\async_http_test.py", line 16, in download_all_photos
    resp = await httpx.get("https://jsonplaceholder.typicode.com/photos")
TypeError: object Response can't be used in 'await' expression

I'm clearly doing something wrong, as httpx is built for async. I'm just not sure what it is!

Steven Borg

2 Answers

12

OK. This is frankly embarrassing. There is no need for a workaround. In the problem statement I completely neglected to call the AsyncClient constructor... I cannot believe I missed that for so long. Oh, my...

To fix, simply add the missing parentheses so that the AsyncClient constructor is actually called:

    async with httpx.AsyncClient() as session:  #use httpx
        await asyncio.gather(*[call_url(session) for x in range(i)])
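
As a rough illustration of why the missing parentheses surface as an `__aexit__` error: `async with` looks up `__aenter__` and `__aexit__` on the type of the object it is given. Passing the class itself means the lookup happens on `type`, which defines neither. The sketch below uses a hypothetical DemoClient class (not part of httpx) so it runs without any third-party package. Newer Python versions report the same mistake as a TypeError instead of an AttributeError, so both are caught.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for an async client; not an httpx class."""

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions


async def try_enter(candidate):
    # Attempt to use `candidate` as an async context manager.
    try:
        async with candidate as session:
            return f"entered with {type(session).__name__}"
    except (AttributeError, TypeError) as err:
        # AttributeError on Python 3.7 (as in the question),
        # TypeError on newer interpreters.
        return f"failed: {type(err).__name__}"


async def main():
    class_result = await try_enter(DemoClient)       # class, no parentheses
    instance_result = await try_enter(DemoClient())  # instance
    return class_result, instance_result


class_result, instance_result = asyncio.run(main())
print(class_result)
print(instance_result)
```

The first call fails because it receives the class; the second succeeds because it receives an instance, which is exactly the difference between `httpx.AsyncClient` and `httpx.AsyncClient()`.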
Steven Borg
5

While experimenting further in writing this question, I discovered a subtle difference in the way httpx and aiohttp treat context managers.

In the code that introduces the question, the following code worked with aiohttp:

    async with aiohttp.ClientSession() as session: #use aiohttp
        await asyncio.gather(*[call_url(session) for x in range(i)])

This code passes the ClientSession object as a parameter to the call_url function. I assume that after asyncio.gather() completes, resources are cleaned up as with a normal with statement.

However, the same approach with httpx fails, as above. This can be easily fixed, however, by simply avoiding the with statement altogether, and manually closing the AsyncClient.

In other words, replace

    async with httpx.AsyncClient as session:  #use httpx
        await asyncio.gather(*[call_url(session) for x in range(i)])

with

    session = httpx.AsyncClient() #use httpx
    await asyncio.gather(*[call_url(session) for x in range(i)])
    await session.aclose()

to fix the issue.

Here's the working code in its entirety:

import aiohttp
import asyncio
import time
import httpx

async def call_url(session):
    url = "https://services.cancerimagingarchive.net/services/v3/TCIA/query/getCollectionValues"
    response = await session.request(method='GET', url=url)
    return response

for i in range(1,5):
    start = time.time() # start time for timing event
    #async with aiohttp.ClientSession() as session: #use aiohttp
    session = httpx.AsyncClient() #use httpx
    await asyncio.gather(*[call_url(session) for x in range(i)])
    await session.aclose()
    print(f'{i} call(s) in {time.time() - start} seconds')
Steven Borg
  • 1
    Why do u import `aiohttp` in the last code example? – Dominux Oct 09 '20 at 20:20
  • 1
    Your final code returns `SyntaxError`. It should be like this: ` for i in range(1,5): start = time.time() # start time for timing event session = httpx.AsyncClient() #use httpx asyncio.gather(*[call_url(session) for x in range(i)]) print(f'{i} call(s) in {time.time() - start} seconds') ` – Ruben García Tutor Nov 18 '20 at 09:59