I have a project that depends on Scrapy 2.3.0, which uses Twisted 20.3.0 as its network engine. I am trying to convert the callback-based approach used by Scrapy to coroutines and run it with Python's asyncio. To make an HTTP request, one needs to create a new Request object, which takes a callback that handles the success case and an errback that handles failures. So I wrote the following method to do that conversion:

import asyncio
from scrapy import http
# ...
async def request(
    self, url: str, method: str, *, headers: dict = None, cookies: dict = None,
    body: str = None, meta: dict = None
) -> http.Response:
    fut = asyncio.get_running_loop().create_future()
    def fire_success(response):
        if not fut.done():
            fut.set_result(response)
        return None
    def fire_failure(failure):
        if not fut.done():
            fut.set_exception(failure.value)
        return None
    scrapy_request = http.Request(
        url,
        method=method,
        headers=headers,
        cookies=cookies,
        body=body,
        meta=meta,
        callback=fire_success,
        errback=fire_failure,
        dont_filter=True
    )
    self._crawler.engine.crawl(scrapy_request, self._crawler.spider)
    return await fut

But when an exception is raised, the traceback I get stops at return await fut. Does this mean that the failure.value set in

def fire_failure(failure):
    if not fut.done():
        fut.set_exception(failure.value)
    return None

has no traceback?

The following example reproduces that problem:

import asyncio

async def a():
    await asyncio.sleep(2)
    raise Exception('Hello from a()')

async def b():
    await a()

async def c(fut: asyncio.Future):
    try:
        await b()
    except Exception as e:
        fut.set_exception(e.with_traceback(None))

async def d(fut):
    return await fut

async def main():
    fut = asyncio.get_running_loop().create_future()
    await asyncio.gather(d(fut), c(fut))

asyncio.run(main())

Here I removed the traceback by calling e.with_traceback(None). This is the resulting incomplete traceback (it is missing the call to await b()):

Traceback (most recent call last):
  File "tmp.py", line 23, in <module>
    asyncio.run(main())
  File ".../python/3.8.6/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File ".../python/3.8.6/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "tmp.py", line 21, in main
    await asyncio.gather(d(fut), c(fut))
  File "tmp.py", line 17, in d
    return await fut
Exception: Hello from a()
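
For comparison, here is a minimal sketch of the same reproduction that runs once with the traceback stripped and once with it kept, then lists the function names found in the traceback of the exception that comes out of the future. The `strip` parameter and the `frames()` helper are names I made up for illustration; everything else is standard-library asyncio and traceback.

```python
import asyncio
import traceback


async def a():
    await asyncio.sleep(0)
    raise Exception('Hello from a()')


async def b():
    await a()


async def c(fut: asyncio.Future, strip: bool):
    try:
        await b()
    except Exception as e:
        # strip=True mimics an exception that arrives without a traceback.
        fut.set_exception(e.with_traceback(None) if strip else e)


async def d(fut):
    return await fut


async def frames(strip: bool):
    """Hypothetical helper: function names in the traceback seen by d()."""
    fut = asyncio.get_running_loop().create_future()
    # return_exceptions=True hands back the exception object instead of raising it.
    results = await asyncio.gather(d(fut), c(fut, strip), return_exceptions=True)
    exc = results[0]
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


if __name__ == "__main__":
    print("stripped:", asyncio.run(frames(strip=True)))
    print("kept:    ", asyncio.run(frames(strip=False)))
```

When the exception keeps its `__traceback__`, the frames for b() and a() should show up after the await in d(). So the future itself does not appear to drop anything; the frames are only missing when the exception passed to set_exception() has no traceback attached.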
hldev
  • Check the documentation of Failure: https://twistedmatrix.com/documents/current/api/twisted.python.failure.Failure.html – Gallaecio Nov 06 '20 at 17:01
  • @Gallaecio `getTraceback()` and `getBriefTraceback()` are undocumented. `getTracebackObject()` does not return a traceback object accepted by `Exception.with_traceback()`. – hldev Nov 22 '20 at 19:33
  • Do not miss `getTracebackObject`. – Gallaecio Feb 09 '21 at 12:19
  • I have tried `Exception.with_traceback(failure.getTracebackObject())`, but the traceback returned by Twisted is not compatible (Python rejects it with a TypeError), as I said in the comment above. – hldev Feb 24 '21 at 23:50
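
One possible workaround for the incompatible traceback object, sketched under assumptions: instead of attaching a traceback object, carry the formatted traceback as text and chain it to the exception as its `__cause__`. `TracebackText` and `with_traceback_text` are hypothetical names, not part of Scrapy or Twisted. The sketch also assumes that `failure.getTraceback()` returns the formatted traceback as a string, which is why it appears only in a comment and the demo uses the standard-library traceback module instead.

```python
import traceback


class TracebackText(Exception):
    """Hypothetical helper: carries an already formatted traceback as text."""

    def __init__(self, text: str):
        super().__init__(text)
        self.text = text

    def __str__(self):
        return self.text


def with_traceback_text(exc: BaseException, tb_text: str) -> BaseException:
    """Chain the traceback text to exc so it is printed together with exc."""
    exc.__cause__ = TracebackText(tb_text)
    return exc


# Assumed usage inside the errback from the question (not executed here):
#
#     def fire_failure(failure):
#         if not fut.done():
#             fut.set_exception(
#                 with_traceback_text(failure.value, failure.getTraceback())
#             )
#         return None


if __name__ == "__main__":
    def boom():
        raise ValueError("boom")

    try:
        boom()
    except ValueError as e:
        text = traceback.format_exc()
        # Drop the real traceback to imitate the situation in the question.
        err = with_traceback_text(e.with_traceback(None), text)

    print("".join(traceback.format_exception(type(err), err, err.__traceback__)))
```

The original frames are then available only as text in the printed output, not as frame objects, so a debugger cannot walk them. That may still be enough when the goal is just to see where the request failed.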
