
I'm trying to run asyncio tasks concurrently on each worker thread of a concurrent.futures ThreadPoolExecutor, but I couldn't achieve the desired outcome.

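import asyncio
import logging

# Assumed logging setup, inferred from the output format shown below
logging.basicConfig(level=logging.INFO, datefmt="%H:%M:%S",
                    format="%(asctime)s:%(threadName)s:%(message)s")

async def say_after(delay, message):
    logging.info(f"{message} received")
    await asyncio.sleep(delay)  # yields to the event loop while waiting
    logging.info(f"Printing {message}")

async def main():
    logging.info("Main started")
    await asyncio.gather(say_after(2, "TWO"), say_after(3, "THREE"))
    logging.info("Main Ended")

# Top-level await works in Jupyter; in a script use asyncio.run(main())
await main()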

Output:

20:12:26:MainThread:Main started
20:12:26:MainThread:TWO received
20:12:26:MainThread:THREE received
20:12:28:MainThread:Printing TWO
20:12:29:MainThread:Printing THREE
20:12:29:MainThread:Main Ended

To summarize my understanding of the code above: asyncio.gather wraps the coroutines in tasks and schedules them on the event loop running in MainThread. Unsurprisingly, this saves time compared to running them one after another.
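As a rough check of that summary, the sketch below times the same coroutines awaited one by one and through asyncio.gather. The helper names (run_sequentially, run_concurrently, timed) are made up for illustration, and it uses asyncio.run, so it is meant for a plain script rather than a notebook cell.

```python
import asyncio
import time

async def say_after(delay, message):
    # Non-blocking wait: the event loop can run other tasks meanwhile
    await asyncio.sleep(delay)
    return message

async def run_sequentially(jobs):
    # Each coroutine is awaited to completion before the next one starts,
    # so the total time is roughly the sum of the delays
    return [await say_after(delay, message) for delay, message in jobs]

async def run_concurrently(jobs):
    # gather schedules all coroutines as tasks on the same loop,
    # so the total time is roughly the longest delay
    return await asyncio.gather(*(say_after(d, m) for d, m in jobs))

def timed(coro_fn, jobs):
    # Hypothetical helper: run one of the functions above and measure it
    start = time.perf_counter()
    result = asyncio.run(coro_fn(jobs))
    return list(result), time.perf_counter() - start
```

Calling timed(run_sequentially, jobs) and timed(run_concurrently, jobs) with the same jobs should show the gap between the summed delays and the longest single delay.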

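import concurrent.futures as cf
import time

# Assumed mapping, inferred from the output below (not shown in the original post)
num_word_mapping = dict(enumerate("ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE TEN".split(), start=1))

def say_after(delay, message):
    logging.info(f"{message} received")
    time.sleep(delay)  # blocks only the worker thread that runs this job
    logging.info(f"Printing {message}")

with cf.ThreadPoolExecutor(max_workers=3) as executor:
    results = [executor.submit(say_after, i+1, num_word_mapping[i+1]) for i in range(10)]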

To summarize my understanding: the ThreadPoolExecutor creates three worker threads, which the OS schedules preemptively to achieve concurrency.
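A small sketch of that behaviour, assuming short delays so it runs quickly: each job reports the name of the worker thread that ran it, which makes it visible that a pool of three threads is reused for all jobs. The names blocking_job and run_on_pool are illustrative only.

```python
import concurrent.futures as cf
import threading
import time

def blocking_job(delay):
    # time.sleep blocks only the calling worker thread
    time.sleep(delay)
    return threading.current_thread().name

def run_on_pool(delays, max_workers=3):
    start = time.perf_counter()
    with cf.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map returns results in the order of the inputs
        thread_names = list(executor.map(blocking_job, delays))
    return thread_names, time.perf_counter() - start
```

With six equal delays and three workers, the jobs should finish in about two rounds, and at most three distinct thread names should appear.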

Output:

19:38:43:ThreadPoolExecutor-9_0:ONE received
19:38:43:ThreadPoolExecutor-9_1:TWO received
19:38:43:ThreadPoolExecutor-9_2:THREE received
19:38:44:ThreadPoolExecutor-9_0:Printing ONE
19:38:44:ThreadPoolExecutor-9_0:FOUR received
19:38:45:ThreadPoolExecutor-9_1:Printing TWO
19:38:45:ThreadPoolExecutor-9_1:FIVE received
19:38:46:ThreadPoolExecutor-9_2:Printing THREE
19:38:46:ThreadPoolExecutor-9_2:SIX received
19:38:48:ThreadPoolExecutor-9_0:Printing FOUR
19:38:48:ThreadPoolExecutor-9_0:SEVEN received
19:38:50:ThreadPoolExecutor-9_1:Printing FIVE
19:38:50:ThreadPoolExecutor-9_1:EIGHT received
19:38:52:ThreadPoolExecutor-9_2:Printing SIX
19:38:52:ThreadPoolExecutor-9_2:NINE received
19:38:55:ThreadPoolExecutor-9_0:Printing SEVEN
19:38:55:ThreadPoolExecutor-9_0:TEN received
19:38:58:ThreadPoolExecutor-9_1:Printing EIGHT
19:39:01:ThreadPoolExecutor-9_2:Printing NINE
19:39:05:ThreadPoolExecutor-9_0:Printing TEN

Now I want to run an event loop with multiple tasks on each worker thread. I tried the code below, but it didn't improve the execution time.

def say_after(delay, message):
    logging.info(f"{message} received")
    time.sleep(delay)
    logging.info(f"Printing {message}")

async def parallel(executor, delay, message):
    loop = asyncio.get_running_loop()
    # Await the returned future so that gather waits for the job to finish
    await loop.run_in_executor(executor, say_after, delay, message)

async def main():
    executor = cf.ThreadPoolExecutor(max_workers=3)
    await asyncio.gather(*[parallel(executor, i+1, num_word_mapping[i+1]) for i in range(10)])

await main()

Output:

20:57:04:ThreadPoolExecutor-19_0:ONE received
20:57:04:ThreadPoolExecutor-19_1:TWO received
20:57:04:ThreadPoolExecutor-19_2:THREE received
20:57:05:ThreadPoolExecutor-19_0:Printing ONE
20:57:05:ThreadPoolExecutor-19_0:FOUR received
20:57:06:ThreadPoolExecutor-19_1:Printing TWO
20:57:06:ThreadPoolExecutor-19_1:FIVE received
20:57:07:ThreadPoolExecutor-19_2:Printing THREE
20:57:07:ThreadPoolExecutor-19_2:SIX received
20:57:09:ThreadPoolExecutor-19_0:Printing FOUR
20:57:09:ThreadPoolExecutor-19_0:SEVEN received
20:57:11:ThreadPoolExecutor-19_1:Printing FIVE
20:57:11:ThreadPoolExecutor-19_1:EIGHT received
20:57:13:ThreadPoolExecutor-19_2:Printing SIX
20:57:13:ThreadPoolExecutor-19_2:NINE received
20:57:16:ThreadPoolExecutor-19_0:Printing SEVEN
20:57:16:ThreadPoolExecutor-19_0:TEN received
20:57:19:ThreadPoolExecutor-19_1:Printing EIGHT
20:57:22:ThreadPoolExecutor-19_2:Printing NINE
20:57:26:ThreadPoolExecutor-19_0:Printing TEN

I expected the last snippet to run faster, but I am not sure whether I am doing it the right way.

Environment: Python 3.7 (Jupyter Notebook)

  • Swap `ThreadPoolExecutor` for a `ProcessPoolExecutor`. See my answer here for a more complete answer https://stackoverflow.com/questions/56992595/how-can-i-launch-a-blocking-task-from-asyncio-asynchronously/57024393#57024393 – Tim Jul 14 '19 at 10:05

1 Answer

"Now I want to run an event loop with multiple tasks on each worker thread."

Concurrent workers are completely separate from the event loop. Each pool consists of a number of workers, and each worker can do one job at any given time. This functionality is provided by the concurrent.futures module and is completely orthogonal to asyncio.

So, when you use run_in_executor to access the thread pool, there is no reason that the code would magically become faster. After all, you are still executing 10 tasks on 3 workers, just like before. The only value run_in_executor has added is that now you can await those workers in your asyncio event loop.
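The timings in the question can be reproduced without running anything in real time. The sketch below is a simplified model, not how concurrent.futures is implemented: it assumes each queued job goes to whichever worker becomes free first, and the function name pool_makespan is made up for illustration.

```python
import heapq

def pool_makespan(delays, max_workers):
    """Estimate the total time for blocking jobs dispatched in FIFO order
    to the first free worker (a simplified model of a thread pool)."""
    free_at = [0] * max_workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    finish = 0
    for delay in delays:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + delay
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

print(pool_makespan(range(1, 11), max_workers=3))   # 22
print(pool_makespan(range(1, 11), max_workers=10))  # 10
```

The 22 seconds for three workers matches both logs in the question (19:38:43 to 19:39:05 and 20:57:04 to 20:57:26), regardless of whether the pool is driven directly or through run_in_executor.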

To speed up the code, you need to either increase the number of workers, or stop using run_in_executor altogether and start using asyncio-native facilities, as in your first example.
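Both options might look roughly like the sketch below. It is only an illustration under some assumptions: the delays are scaled down by a made-up SCALE factor so it runs quickly, the function names are hypothetical, and asyncio.run is used, so in a notebook you would await the coroutines directly instead.

```python
import asyncio
import concurrent.futures as cf
import time

SCALE = 0.05  # assumed factor to shrink the 1..10 second delays

def blocking_say_after(delay, message):
    time.sleep(delay * SCALE)  # blocks one worker thread
    return message

async def native_say_after(delay, message):
    await asyncio.sleep(delay * SCALE)  # suspends only this task
    return message

async def with_more_workers(jobs):
    # Option 1: one worker per job, so no job waits in the pool's queue
    loop = asyncio.get_running_loop()
    with cf.ThreadPoolExecutor(max_workers=len(jobs)) as executor:
        futures = [loop.run_in_executor(executor, blocking_say_after, d, m)
                   for d, m in jobs]
        return await asyncio.gather(*futures)

async def with_native_asyncio(jobs):
    # Option 2: no threads at all, every job is a task on the event loop
    return await asyncio.gather(*(native_say_after(d, m) for d, m in jobs))

def timed(coro_fn, jobs):
    # Hypothetical helper to measure either option
    start = time.perf_counter()
    result = asyncio.run(coro_fn(jobs))
    return list(result), time.perf_counter() - start
```

In both variants the total time should be close to the longest single delay rather than the queued total seen with three workers. The second variant only applies when the work can be expressed with awaitable, non-blocking calls.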

user4815162342