
Please look at the code below (for the sake of simplicity I am not using pydantic to group coroutines, retries and timeouts):

import asyncio
import typing as tp
import random

async def my_func(wait_time: int) -> str:
    # Jitter the sleep around wait_time so some attempts exceed their timeout
    random_number = random.random()
    random_time = wait_time - random_number if random.random() < 0.5 else wait_time + random_number
    print(f"waiting for {random_time:.2f} seconds")
    await asyncio.sleep(random_time)
    return f"waited for {random_time:.2f} seconds"

async def main() -> tp.Tuple[tp.Optional[str], tp.Optional[str], tp.Optional[str]]:

    task1 = asyncio.create_task(my_func(wait_time=1), name='task1')
    task2 = asyncio.create_task(my_func(wait_time=2), name='task2')
    task3 = asyncio.create_task(my_func(wait_time=3), name='task3')

    task1_timeout = 1.2
    task2_timeout = 2.2
    task3_timeout = 3.2

    task1_retry = 4
    task2_retry = 3
    task3_retry = 2

    total_timeout = 5

    <what to put here?>

    return task1_result, task2_result, task3_result

asyncio.run(main())

As you can see, I have a function my_func (in real life I will have multiple different functions). In main() I have defined 3 tasks. Each task has its own timeout and retry count. For example, task1 has a timeout of 1.2 seconds and can be retried up to 4 times.

Furthermore, I have another (global) timeout, total_timeout, which represents the time in which main() must complete.

For example, if task1 starts running and doesn't produce a result within 1.2 seconds, we should retry it, up to 4 times in total, so even if we cannot get the result at all, we are still below the total_timeout of 5 seconds.

For task2, which times out after 2.2 seconds and can be retried 3 times, the second attempt finishes at 4.4 seconds; if we retry it again, the third attempt will be cut off by total_timeout at the 5th second.

For task3, if we don't complete it on the first try, we don't have enough time for a second try (because of total_timeout).
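
In other words, task1 needs at most 4 × 1.2 = 4.8 seconds for all of its attempts, task2 would need 3 × 2.2 = 6.6 seconds (so its third attempt gets cut off at the 5-second mark), and task3 would need 2 × 3.2 = 6.4 seconds (so only its first attempt fits within total_timeout).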

I would like to execute all three tasks concurrently, respecting their individual timeouts and retries, as well as total_timeout. At the end, after at most 5 seconds, I will get a tuple of three elements, each of which is a str (the output of my_func) or None (in case all retries failed, or the task has been cut off by total_timeout). So the output can be (str, str, str), (str, None, str) or (None, None, None).

Can someone provide some example code that would do what I have described?

user3225309
  • You need something like `await asyncio.gather(task1, task2, task3)`. That will return the three results in the order you pass in the awaitables. Keep in mind, though, that asyncio doesn't run things in parallel. It allows one task to run while one or more other tasks are waiting for I/O to complete. – dirn Sep 27 '22 at 18:38
  • gather doesn't have a timeout at all – user3225309 Sep 27 '22 at 20:02
  • Instead of `create_task` you should use `wait_for`. It's pretty much the entire [timeouts section of the docs](https://docs.python.org/3/library/asyncio-task.html#timeouts). – dirn Sep 27 '22 at 21:35
  • Yes, it sounds easy. You have wait_for with a timeout (but only for a single awaitable), you have wait with a timeout for multiple awaitables, you have gather with no timeout... a lot of options, but I haven't yet seen anyone provide a solution for what I have described. I think this is something many people could benefit from. – user3225309 Sep 28 '22 at 15:21
  • Which of these have you tried? Did any of them work? If they didn't work, what was wrong with each version? – dirn Sep 28 '22 at 16:31
  • @dirn if you understand my question and know how to do it, you would need 15 minutes to provide an answer. I tried different techniques, but always got only a partial result. – user3225309 Sep 28 '22 at 20:26

1 Answer


I think this is a great question. I propose this solution which combines asyncio.gather() and asyncio.wait_for().

Here the third task is asked to sleep around 5 seconds with a 5.2-second per-attempt timeout (up to 5 attempts), so it usually cannot finish before total_timeout and will return None, as asyncio.TimeoutError will be raised (and caught).

import asyncio
import random
import sys
import time
from typing import Optional


total_timeout = float(sys.argv[1]) if len(sys.argv) > 1 else 5.0


async def work_coro(wait_time: int) -> str:
    # Jitter the sleep around wait_time so some attempts exceed their timeout
    random_number = random.random()
    random_time = wait_time - random_number if \
        random.random() < 0.5 else wait_time + random_number

    # Simulate an occasional failure inside the worker
    if random_number > 0.7:
        raise RuntimeError('Random sleep time too high')

    print(f"sleeping for {random_time:.2f} seconds (wait_time={wait_time})")

    await asyncio.sleep(random_time)

    return f"waited for {random_time:.2f} seconds (wait_time={wait_time})"


async def coro_trunner(wait_time: int,
                       retry: int,
                       timeout: float) -> Optional[str]:
    """
    Run work_coro in a controlled timing environment

    :param int wait_time: How long the coroutine will sleep on each run
    :param int retry: Maximum number of attempts (retry on timeout or error)
    :param float timeout: Per-attempt timeout for the coroutine
    """

    for attempt in range(retry):
        try:
            start_time = time.monotonic()
            print(f'{work_coro}: ({wait_time}, {retry}, {timeout}): '
                  f'spawning attempt {attempt + 1}/{retry}')

            return await asyncio.wait_for(work_coro(wait_time),
                                          timeout)
        except asyncio.TimeoutError:
            diff_time = time.monotonic() - start_time
            print(f'{work_coro}: ({wait_time}, {retry}, {timeout}): '
                  f'timeout (diff_time: {diff_time})')
            continue
        except asyncio.CancelledError:
            # total_timeout was reached and this task got cancelled:
            # stop retrying so the gather can finish
            print(f'{work_coro}: ({wait_time}, {retry}, {timeout}): '
                  'cancelled')
            break
        except Exception as err:
            # Unknown error raised in the worker: give it another chance
            print(f'{work_coro}: ({wait_time}, {retry}, {timeout}): '
                  f'error in worker: {err}')
            continue

    # All attempts failed (or the task was cancelled)
    return None


async def main() -> list:
    tasks = [
        asyncio.create_task(coro_trunner(1, 2, 1.2)),
        asyncio.create_task(coro_trunner(2, 3, 2.2)),
        asyncio.create_task(coro_trunner(5, 5, 5.2))
    ]

    try:
        gaf = asyncio.gather(*tasks)
        results = await asyncio.wait_for(gaf,
                                         total_timeout)
    except (asyncio.TimeoutError,
            asyncio.CancelledError,
            Exception):
        # Total timeout reached: collect the results that are ready

        # Consume the gather exception so asyncio does not warn that the
        # _GatheringFuture exception was never retrieved
        if gaf.done() and not gaf.cancelled():
            gaf.exception()

        results = []

        for task in tasks:
            if task.done() and not task.cancelled() \
                    and task.exception() is None:
                results.append(task.result())
            else:
                # We want to know when a task yields nothing
                results.append(None)
                task.cancel()

    return results


print(f'Total timeout: {total_timeout}')


# Measure wall time with time.monotonic(); asyncio.run() creates and
# closes its own event loop, so no module-level loop is needed
start_time = time.monotonic()
results = asyncio.run(main())
end_time = time.monotonic()

print(f'{end_time - start_time:.2f} --> {results}')
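
The asker mentions that in real life there will be multiple different functions, each with its own retry count and per-attempt timeout. Here is a rough generalization of the same pattern, just a sketch: the names run_with_retry, run_all and the specs parameter are mine, not part of the code above. Each entry in specs is (coroutine function, positional args, retry count, per-attempt timeout), and the same wait_for() around gather() enforces total_timeout while keeping whatever finished in time.

import asyncio
from typing import Any, Awaitable, Callable, Optional, Sequence, Tuple


async def run_with_retry(coro_func: Callable[..., Awaitable[str]],
                         args: Tuple[Any, ...],
                         retry: int,
                         timeout: float) -> Optional[str]:
    """Run coro_func(*args) up to `retry` times, each attempt capped at `timeout` seconds."""
    for _ in range(retry):
        try:
            return await asyncio.wait_for(coro_func(*args), timeout)
        except asyncio.CancelledError:
            # total_timeout was reached and this task got cancelled: stop retrying
            break
        except Exception:
            # per-attempt timeout or an error in the worker: try again
            continue
    return None


async def run_all(specs: Sequence[Tuple[Callable[..., Awaitable[str]],
                                        Tuple[Any, ...], int, float]],
                  total_timeout: float) -> list:
    tasks = [asyncio.create_task(run_with_retry(func, args, retry, timeout))
             for func, args, retry, timeout in specs]

    gathered = asyncio.gather(*tasks)
    try:
        return await asyncio.wait_for(gathered, total_timeout)
    except (asyncio.TimeoutError, asyncio.CancelledError):
        # Consume the gather exception to avoid the
        # "exception was never retrieved" warning
        if gathered.done() and not gathered.cancelled():
            gathered.exception()

        # Keep whatever finished in time; everything else becomes None
        results = []
        for task in tasks:
            if task.done() and not task.cancelled() and task.exception() is None:
                results.append(task.result())
            else:
                results.append(None)
                task.cancel()
        return results


# Usage sketch, reusing my_func from the question:
# results = asyncio.run(run_all(
#     [(my_func, (1,), 4, 1.2),
#      (my_func, (2,), 3, 2.2),
#      (my_func, (3,), 2, 3.2)],
#     total_timeout=5.0))

On Python 3.11+ the outer asyncio.wait_for() could also be written as an async with asyncio.timeout(total_timeout): block; the overall structure stays the same.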
cipres
  • in async def work_coro, the line await asyncio.sleep(wait_time) should actually be await asyncio.sleep(random_time), shouldn't it? – user3225309 Sep 28 '22 at 20:35
  • The problem with your approach is that if task3 times out, you end up in "timeout on main()" and get nothing as a result, even though task1 and task2 completed successfully. I set total_timeout to 5 and got a single spawning for (1, 2, 1.2) and a single spawning for (1, 3, 2.2), which means they completed in the first run, but due to the task3 timeout I got nothing as a result of main(). I would like to get everything that has been completed and None for everything else. – user3225309 Sep 28 '22 at 20:51
  • Indeed indeed, i'm high, i just got it working as you just described it, i'll post the edit in a bit, thanks – cipres Sep 28 '22 at 21:10
  • Edited the post: the new version can be run with the timeout on the command line (5 seconds otherwise). It works on python 3.7, 3.8, don't know why but with 3.9 TimeoutError is not fired for the main coroutine. – cipres Sep 28 '22 at 23:45
  • I am using python 3.9.14. In your code I added: import datetime as dt, start_time = dt.datetime.now() (before try: where you are setting the loop, and print(f"lasted: {round((dt.datetime.now()-start_time).total_seconds(),1)}") before last if. For default timeout of 5 seconds, script lasted 17.6 seconds which is far away from 5 seconds limit. In three tries I always get 17.6 seconds. In python 3.10.4 I got times: 17.6, 13.3, 17.6. – user3225309 Sep 29 '22 at 06:58
  • In python 3.9 I also got:
    _GatheringFuture exception was never retrieved
    future: <_GatheringFuture finished exception=CancelledError()>
    asyncio.exceptions.CancelledError
    – user3225309 Sep 29 '22 at 07:02
  • Works on any version now. Had to break out of *coro_trunner* when CancelledError is received. – cipres Sep 29 '22 at 12:25
  • please give me some time to test it thoroughly. At first glance it looks very promising. :) – user3225309 Sep 29 '22 at 17:41
  • question 2: I am recording the start time before the try/except with the loop definition starts, and the end time after that try/except block. If task3 is set as (5, 5, 5.2), and all three tasks pass on the first try, how come the execution lasts 4.3 seconds? It should last at least 5 seconds due to the task3 definition, shouldn't it? I would like to provide the output here but there is no room. :( – user3225309 Sep 29 '22 at 19:42
  • Look at work_coro :) the coro will sleep less time (or more time) if the random number generated is < 0.5! So for example random_time might be 4.5 or 5.3 secs (so every run is different) even if the original wait_time passed was 5. And therefore, the coroutine will exit earlier (if random_time < wait_time), and gather() has all it asked for, and main() is finished and sets the fresults Future. So in this case, the maximum run time of main() is equal to the longest coroutine timeout value (the now famous *random_time*) – cipres Sep 29 '22 at 20:46
  • s/coroutine timeout value/coroutine sleep time/ – cipres Sep 29 '22 at 20:55
  • Yes, sure. Sorry for stupid question 2. :D – user3225309 Oct 02 '22 at 09:44
  • I think that we can avoid usage of fresults = asyncio.Future(). From main you can just return results list. In main we should convert tuple in try block to list. – user3225309 Oct 02 '22 at 09:47
  • Also in try: loop = ... you have loop.run_until_complete(asyncio.wait_for(main(), timeout=total_timeout)) as well as except timeout/cancelled. Then in main() you have results = await asyncio.wait_for(gaf, total_timeout), so again wait_for and total_timeout, as well as except timeout/cancelled. Do we need in try: loop = wait_for/timeout or we can just loop.run_until_complete(main())? What do you think? – user3225309 Oct 02 '22 at 10:56
  • Posted a simplified version with no intermediary Future. Thank you :) – cipres Oct 03 '22 at 15:35
  • My friend, what you have shown here is remarkable asyncio knowledge. I still have some remarks, but please consider them as a way to improve this answer, not complaints. Having said that, I put in the work_coro function: if random_number > 0.7: raise RuntimeError(), in which case I ended up in main()'s except Exception and got: "UnboundLocalError: local variable 'results' referenced before assignment". If you put results = [] before the try it will work. – user3225309 Oct 06 '22 at 17:33
  • Also, if an error is raised in work_coro, it will propagate to coro_trunner, then to main, in which case we would return nothing (actually an empty list if you use my suggestion from the comment above), even though some of the tasks may have completed normally. What do you think? – user3225309 Oct 06 '22 at 17:42
  • Thank you (i work on https://galacteek.gitlab.io and it's an asyncio battlefield ...). Yeah indeed, *coro_trunner* should handle the exceptions coming from *work_coro*, i'll post an update tomorrow. – cipres Oct 07 '22 at 20:03
  • Validate the answer maybe ? – cipres Mar 19 '23 at 12:15
  • your answer is surely the best so far. Thanks again. – user3225309 Mar 20 '23 at 23:45