I have a synchronous iterator that comes from a third-party package. The iterator queries an external service and yields data; when there is no data, it blocks until some arrives. I subclassed Starlette's WebSocketEndpoint to send new data from the iterator over a websocket. Unfortunately, I seem to be misunderstanding something, because my code doesn't work as expected. Here is a slightly simplified version:

import time

from starlette.endpoints import WebSocketEndpoint
from starlette.websockets import WebSocket


class Iterator:
    """This is a third-party object, not asynchronous at all."""

    def __init__(self):
        self._stop = False

    def __iter__(self):
        self.i = 0
        return self

    def __next__(self):
        if self._stop:
            raise StopIteration

        time.sleep(5)
        self.i += 1
        print(self.i)
        return self.i

    def cancel(self):
        self._stop = True


class MyWebSocket(WebSocketEndpoint):
    def __init__(self, scope, receive, send) -> None:
        super().__init__(scope, receive, send)

        self.iterator = Iterator()

    async def on_connect(self, websocket: WebSocket) -> None:
        await super().on_connect(websocket)

        for message in self.iterator:
            await websocket.send_json({"message": message})

    async def on_disconnect(self, websocket: WebSocket, close_code: int) -> None:
        await super().on_disconnect(websocket, close_code)

        self.iterator.cancel()

First problem: the code doesn't send any data through the websocket. The print statement shows that the iterator produces data, but nothing is actually sent. If I put a return after websocket.send_json(), the first result from the iterator is sent correctly, but the loop ends there. Why?

The second problem is that the iterator completely blocks the application. I understand why that happens, but this is a web service and the iterator is meant to run until the client disconnects from the websocket, so it can easily block my whole application. With 10 workers, 10 websocket clients will block the application, and nothing else can be served until one of them disconnects. How can I resolve this?

Djent

1 Answer

This is a third-party object, not asynchronous at all.

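And therein lies the problem: asyncio is single-threaded, so your iterator must either not block at all (as when iterating over a built-in collection), or you must use an async iterator with an async for loop, which suspends its execution while awaiting the next item. While time.sleep() is blocking inside __next__, the event loop cannot run anything else; that is most likely also why the data handed to send_json() is not actually written to the socket until the coroutine returns.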
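To make the effect of a blocking call concrete, here is a minimal sketch that uses only the standard library. The names ticker, blocking_worker, cooperative_worker and run are made up for this illustration and have nothing to do with Starlette or the third-party package. It runs a small "ticker" task next to a worker that either blocks with time.sleep() or cooperates with asyncio.sleep():

```python
import asyncio
import time


async def ticker(log, n=3, interval=0.05):
    # Stands in for "everything else" the event loop should be serving.
    for i in range(n):
        log.append(f"tick {i}")
        await asyncio.sleep(interval)


async def blocking_worker(log):
    # time.sleep() does not yield to the event loop, so no other task
    # can run until it returns.
    time.sleep(0.3)
    log.append("worker done")


async def cooperative_worker(log):
    # asyncio.sleep() suspends this coroutine and lets other tasks run.
    await asyncio.sleep(0.3)
    log.append("worker done")


async def run(worker):
    log = []
    await asyncio.gather(worker(log), ticker(log))
    return log


blocking_log = asyncio.run(run(blocking_worker))
cooperative_log = asyncio.run(run(cooperative_worker))
print(blocking_log)
print(cooperative_log)
```

With the blocking worker the ticker cannot even start until the worker has finished, whereas with the cooperative worker all the ticks happen while the worker is still waiting.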

When dealing with a third-party blocking function, you can incorporate it into async code using run_in_executor, which submits the function to a thread pool and suspends the current coroutine until the function completes. You cannot pass an iterator to run_in_executor directly, but you can create a wrapper that takes a sync iterator and runs each individual invocation of __next__ through run_in_executor, providing the interface of an async iterator. For example:

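import asyncio


async def wrap_iter(iterable):
    # We are always inside a running loop here, so get_running_loop()
    # is the appropriate call.
    loop = asyncio.get_running_loop()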
    it = iter(iterable)

    DONE = object()
    def get_next_item():
        # Get the next item synchronously.  We cannot call next(it)
        # directly because StopIteration cannot be transferred
        # across an "await".  So we detect StopIteration and
        # convert it to a sentinel object.
        try:
            return next(it)
        except StopIteration:
            return DONE

    while True:
        # Submit execution of next(it) to another thread and resume
        # when it's done.  await will suspend the coroutine and
        # allow other tasks to execute while waiting.
        next_item = await loop.run_in_executor(None, get_next_item)
        if next_item is DONE:
            break
        yield next_item

Now you can replace for message in self.iterator with async for message in wrap_iter(self.iterator), and everything should work.
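If you want to try the idea without Starlette, the following self-contained sketch may help. SlowCounter is a hypothetical stand-in for the third-party iterator, heartbeat exists only to show that the event loop stays responsive, and wrap_iter is written here as a compact variant of the wrapper above, assuming Python 3.9+ so that asyncio.to_thread is available:

```python
import asyncio
import time


class SlowCounter:
    """Hypothetical stand-in for the blocking third-party iterator."""

    def __init__(self, limit, delay=0.05):
        self.limit = limit
        self.delay = delay
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.limit:
            raise StopIteration
        time.sleep(self.delay)  # blocks only the worker thread
        self.i += 1
        return self.i


async def wrap_iter(iterable):
    # asyncio.to_thread runs next() in a worker thread; passing a
    # sentinel as the default avoids raising StopIteration there.
    it = iter(iterable)
    done = object()
    while True:
        item = await asyncio.to_thread(next, it, done)
        if item is done:
            break
        yield item


async def heartbeat(beats, interval=0.01):
    # Runs concurrently with the consumer below.
    while True:
        beats.append(time.monotonic())
        await asyncio.sleep(interval)


async def main():
    beats = []
    hb = asyncio.create_task(heartbeat(beats))
    received = [item async for item in wrap_iter(SlowCounter(3))]
    hb.cancel()
    return received, len(beats)


received, beat_count = asyncio.run(main())
print(received, beat_count)
```

The heartbeat keeps firing while the iterator sleeps in its worker thread, which is the behaviour you want for the websocket endpoint. Keep in mind that each pending next() call occupies one thread of the executor for as long as it blocks, so the number of simultaneously served clients is bounded by the size of the thread pool.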

user4815162342