I recently started digging deeper into asynchronous code in Python, and I am wondering why asyncio.sleep is so important.
Use Case
- I have a synchronous source of data coming from a microphone every x milliseconds.
- I check whether there is a wakeword and, if so, open a connection to my server through websockets.
- I send / receive messages asynchronously and independently.
My ideal implementation is that as soon as a message is ready it is sent, and as soon as a message is received it is processed.
This must be efficient, since we want to go down to x = 20 ms (frames from the microphone received every 20 ms).
Implementation
The code is the following:
- It has a consumer / producer approach: Consumer receives messages, Producer sends messages.
- The frames from the microphone are put in a synchronous queue
- The producer / consumer are handled in a different thread
- The Queue is shared between the main thread and the other one. As soon as a new message is put, it will be processed on the other end.
import asyncio
import msgpack
import os
import pyaudio
import ssl
import websockets
from threading import Thread
from queue import Queue
from dotenv import load_dotenv

# some utilities
from src.utils.constants import CHANNELS, CHUNK, FORMAT, RATE
from .utils import websocket_data_packet

load_dotenv()

QUEUE_MAX_SIZE = 10
MY_URL = os.environ.get("WEBSOCKETS_URL")
ssl_context = ssl.SSLContext()

class MicrophoneStreamer(object):
    """This handles the microphone and yields chunks of data when they are ready."""

    chunk: int = CHUNK
    channels: int = CHANNELS
    format: int = FORMAT
    rate: int = RATE

    def __init__(self):
        self._pyaudio = pyaudio.PyAudio()
        self.is_stream_open: bool = True
        self.stream = self._pyaudio.open(
            format=self.format,
            channels=self.channels,
            rate=self.rate,
            input=True,
            frames_per_buffer=self.chunk,
        )

    def __iter__(self):
        while self.is_stream_open:
            yield self.stream.read(self.chunk)

    def close(self):
        self.is_stream_open = False
        self.stream.close()
        self._pyaudio.terminate()

async def consumer(websocket):
    async for message in websocket:
        print(f"Received message: {msgpack.unpackb(message)}")


async def producer(websocket, audio_queue):
    while True:
        print("Sending chunck")
        chunck = audio_queue.get()
        await websocket.send(msgpack.packb(websocket_data_packet(chunck)))
        # THE FOLLOWING LINE IS IMPORTANT
        await asyncio.sleep(0.02)

async def handler(audio_queue):
    # A single connection is opened here; the earlier extra
    # `await websockets.connect(...)` opened a second one that was never used.
    async with websockets.connect(MY_URL, ssl=ssl_context) as websocket:
        print("Websocket opened")
        consumer_task = asyncio.create_task(consumer(websocket))
        producer_task = asyncio.create_task(producer(websocket, audio_queue))
        done, pending = await asyncio.wait(
            [consumer_task, producer_task],
            return_when=asyncio.FIRST_COMPLETED,
            timeout=60,
        )
        for task in pending:
            task.cancel()
        # TODO: is the following useful?
        await websocket.close()

def run(audio_queue: Queue):
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(handler(audio_queue))
    loop.close()

def main():
    audio_queue = Queue(maxsize=5)

    # the iterator is synchronous
    for i, chunk in enumerate(MicrophoneStreamer()):
        print("Iteration", i)

        # to simulate condition wakeword detected
        if i == 2:
            thread = Thread(
                target=run,
                args=(audio_queue,),
            )
            thread.start()

        # adds to queue
        if audio_queue.full():
            _ = audio_queue.get_nowait()
        audio_queue.put_nowait(chunk)


if __name__ == "__main__":
    main()
Issue
There is a line in the producer that I marked with the comment # THE FOLLOWING LINE IS IMPORTANT.
If I do not add asyncio.sleep(...) in the producer, the consumer never receives any messages.
When I add asyncio.sleep(0) in the producer, the consumer receives messages, but very late and sporadically.
When I add asyncio.sleep(0.02) in the producer, the consumer receives messages on time.
Why does this happen, and how can I solve it? To send a message every 20 milliseconds, I cannot also sleep 20 ms on every iteration; that would probably throw off the timing.
(Note: I found this sleep fix through this issue.)
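To isolate the behavior, the following is a minimal sketch with no microphone or websocket. It assumes that the blocking audio_queue.get() call in the producer is what keeps the consumer from running. All names here (ticker, feed, blocking_producer, offloaded_producer, run_demo) are hypothetical stand-ins, and ticker plays the role of the consumer. The offloaded variant hands the blocking call to the default thread pool with loop.run_in_executor, which is one possible workaround rather than a verified fix for the real setup.

```python
import asyncio
import queue
import threading
import time


async def ticker(ticks):
    # Stand-in for the consumer: it only advances when the loop gives it a turn.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.005)


async def blocking_producer(q, n):
    for _ in range(n):
        # queue.Queue.get() blocks the whole event loop thread while it waits.
        q.get()


async def offloaded_producer(q, n):
    loop = asyncio.get_running_loop()
    for _ in range(n):
        # The blocking call runs in a worker thread; this coroutine suspends
        # until it finishes, so other tasks can run in the meantime.
        await loop.run_in_executor(None, q.get)


def feed(q, n, period):
    # Stand-in for the microphone: one item every `period` seconds.
    for i in range(n):
        time.sleep(period)
        q.put(i)


async def run_demo(producer, n=5, period=0.02):
    q = queue.Queue()
    ticks = []
    threading.Thread(target=feed, args=(q, n, period), daemon=True).start()
    tick_task = asyncio.create_task(ticker(ticks))
    await producer(q, n)
    tick_task.cancel()
    # Number of turns the ticker got while the producer was working.
    return len(ticks)
```

Here asyncio.run(run_demo(blocking_producer)) returns 0, because the ticker never gets a turn, while asyncio.run(run_demo(offloaded_producer)) returns a positive, timing-dependent count. On Python 3.9+, asyncio.to_thread(q.get) is a roughly equivalent shorthand for the executor call.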
What I tried
I thought that making the iterator asynchronous would solve the issue, but it didn't. If you want to see that implementation, I opened another thread a few days ago here.
I also tried to dig deeper into how event loops work. From my understanding, asyncio.sleep is necessary for the event loop to decide which task to execute and to switch between tasks; for instance, we use it to let a newly created task start.
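To check this understanding, here is a small sketch (worker and run_pair are made-up names for illustration). As far as I can tell, it suggests that asyncio.sleep is not special in itself: a task hands control back to the loop only at an await that actually suspends, and asyncio.sleep(0) is simply the cheapest such await.

```python
import asyncio


async def worker(name, log, yield_control):
    """Append three entries to log, optionally yielding to the loop after each."""
    for i in range(3):
        log.append(f"{name}{i}")
        if yield_control:
            # A zero-second sleep suspends this task for one loop iteration.
            await asyncio.sleep(0)


async def run_pair(yield_control):
    log = []
    # gather wraps both coroutines in tasks and schedules them in order.
    await asyncio.gather(
        worker("a", log, yield_control),
        worker("b", log, yield_control),
    )
    return log
```

Without the sleep, asyncio.run(run_pair(False)) gives a0, a1, a2, b0, b1, b2: the first task runs to completion before the second starts. With it, asyncio.run(run_pair(True)) gives a0, b0, a1, b1, a2, b2, so the two tasks alternate.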
This seems a bit odd to me. Is there a workaround?
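One direction that might work is sketched here, assuming the thread that reads the microphone can be given a reference to the event loop. The synchronous Queue is replaced with an asyncio.Queue, and chunks are pushed onto it with loop.call_soon_threadsafe, so the producer awaits the queue instead of blocking on it. The names microphone_thread, heartbeat and demo are hypothetical, the fake byte chunks stand in for real audio frames, and sent.append(...) stands in for websocket.send(...). Unlike the Queue(maxsize=5) above, this queue is unbounded, and the thread roles are swapped compared with my code (here the loop runs in the main thread), so it is an illustration of the idea rather than a drop-in replacement.

```python
import asyncio
import threading
import time


def microphone_thread(loop, audio_queue, n_chunks, period):
    # Hypothetical stand-in for the PyAudio loop: emits n_chunks fake frames.
    for i in range(n_chunks):
        time.sleep(period)
        chunk = f"chunk-{i}".encode()
        # asyncio.Queue is not thread-safe, so the put is scheduled on the loop.
        loop.call_soon_threadsafe(audio_queue.put_nowait, chunk)
    # Sentinel marking the end of the stream.
    loop.call_soon_threadsafe(audio_queue.put_nowait, None)


async def producer(audio_queue, sent):
    while True:
        # Suspends this task (without blocking the loop) until a chunk arrives.
        chunk = await audio_queue.get()
        if chunk is None:
            break
        sent.append(chunk)  # stand-in for `await websocket.send(...)`


async def heartbeat(beats):
    # Stand-in for the consumer: counts how often it gets a turn.
    while True:
        beats.append(1)
        await asyncio.sleep(0.005)


async def demo(n_chunks=5, period=0.02):
    loop = asyncio.get_running_loop()
    audio_queue = asyncio.Queue()
    sent, beats = [], []
    thread = threading.Thread(
        target=microphone_thread,
        args=(loop, audio_queue, n_chunks, period),
    )
    thread.start()
    beat_task = asyncio.create_task(heartbeat(beats))
    await producer(audio_queue, sent)
    beat_task.cancel()
    thread.join()  # the thread has already sent its sentinel at this point
    return sent, len(beats)
```

Running sent, beats = asyncio.run(demo()) yields the five chunks in order, and beats is positive, which means the stand-in consumer kept running while the producer waited for data, with no asyncio.sleep in the producer.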