
I'm trying to implement the HTTP/2 stack in my own app server, which I've built from scratch using asyncio. As I understand it, asyncio maintains a "tasks" queue internally which the event loop uses to run tasks. Now, to implement stream prioritization I need to be able to run the high priority tasks for a longer duration than the low priority tasks (by tasks I mean the coroutines returned by calls to application(scope, receive, send), as per the ASGI spec). I'm not able to find a way to prioritize this internal queue used by asyncio.
I even thought about capturing the event dicts I get as an argument to the send callable in application(scope, receive, send), but the ASGI spec says "Protocol servers must flush any data passed to them into the send buffer before returning from a send call". What is meant by "send buffer" here? Is it the OS/kernel send buffer?

Am I thinking about stream prioritization in the wrong sense? What would be a good approach to implement this?

import traceback

class Worker(object):
  def get_asgi_event_dict(self, frame):
    event_dict = {
      "type": "http",
      "asgi": {"version": "2.0", "spec_version": "2.1"},
      "http_version": "2",
      "method": frame.get_method(),
      "scheme": "https",
      "path": frame.get_path(),
      "query_string": b"",  # per the ASGI spec, query_string must be bytes
      "headers": [
        [k.encode("utf-8"), v.encode("utf-8")] for k, v in frame.headers.items()
      ],
    }
    return event_dict

  async def handle_request(self):
    try:
      while True:
        # Note: a blocking recv() stalls the whole event loop; a real
        # server would use loop.sock_recv() or asyncio streams here.
        self.request_data = self.client_connection.recv(4096)
        self.frame = self.parse_request(self.request_data)
        if isinstance(self.frame, HeadersFrame):
          if self.frame.end_stream:
            current_stream = Stream(
              self.connection_settings,
              self.header_encoder,
              self.header_decoder,
              self.client_connection,
            )
            current_stream.stream_id = self.frame.stream_id
            asgi_scope = self.get_asgi_event_dict(self.frame)
            current_stream.asgi_app = self.application(asgi_scope)
            # The next line puts the coroutine obtained from a call to
            # current_stream.asgi_app on the "tasks" queue and hence
            # out of my control to prioritize.
            await current_stream.asgi_app(
              self.trigger_asgi_application, current_stream.send_response
            )
        else:
          self.asgi_scope = self.get_asgi_event_dict(self.frame)
    except Exception:
      print("Error occurred in handle_request")
      print(traceback.format_exc())


class Stream(object):
  async def send_response(self, event):
    # Converts the event dict into HTTP/2 frames and sends them to the client.
    ...
Akshay Takkar
  • It would be helpful if you post a working and a failing unit test, showing correct operation of a pair of equal priority tasks, and incorrect sequencing of hi/lo priority tasks. – J_H May 18 '19 at 18:49
  • @J_H I don't have a working example because I'm confused as to how to go about implementing it, but I'll put some code that I have right now for some reference – Akshay Takkar May 19 '19 at 07:21

1 Answer


HTTP/2 stream priorities operate at a different level from the asyncio run queue.

Asyncio primitives are inherently non-blocking. When asyncio is running its task queue, what the kernel ends up seeing is a series of instructions about what to do, such as "start connecting to A", "start writing X to B", and "continue writing Y to C". The order of these instructions within an iteration of the event loop is irrelevant, since the OS will execute them asynchronously anyway.
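To make this concrete, here is a toy illustration (using asyncio.sleep as a stand-in for non-blocking I/O, nothing HTTP/2-specific): the order in which operations are *started* does not dictate the order in which they *finish*, because the event loop just hands the work off and resumes whichever coroutine becomes ready first.

```python
import asyncio

async def op(name, delay, results):
    await asyncio.sleep(delay)  # stands in for a non-blocking I/O call
    results.append(name)

async def main():
    results = []
    # "A" is started first but finishes last; start order is
    # irrelevant to completion order.
    await asyncio.gather(
        op("A", 0.03, results),
        op("B", 0.01, results),
        op("C", 0.02, results),
    )
    return results

print(asyncio.run(main()))  # ['B', 'C', 'A']
```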

HTTP/2 stream priorities play a role when multiplexing many streams over a single TCP connection, and this should be implemented in the layer that actually speaks HTTP/2, such as aiohttp. For example, if you are an HTTP server, the client may have requested the page and multiple images, all over a single TCP connection. When the socket is ready for writing, you get to choose which image to send (or continue sending), and that's where stream priorities come into play.

It's not about asyncio running tasks in a specific order, it's about asyncio primitives being used in the correct order.
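As a minimal sketch of what "choosing which stream to send on" could look like in the layer that owns the connection (the FrameScheduler name and shape are illustrative, not from any real library): streams enqueue outgoing chunks with a priority, and a single sender task drains the queue, so whenever the socket is writable the highest-priority pending chunk goes out first.

```python
import asyncio

class FrameScheduler:
    def __init__(self):
        self._queue = asyncio.PriorityQueue()
        self._seq = 0  # tie-breaker so equal priorities stay FIFO

    def enqueue(self, priority, stream_id, chunk):
        # Lower number = higher priority (PriorityQueue pops smallest first).
        self._queue.put_nowait((priority, self._seq, stream_id, chunk))
        self._seq += 1

    async def drain(self, write):
        # 'write' stands in for writer.write(...) plus await writer.drain()
        while not self._queue.empty():
            _prio, _seq, stream_id, chunk = await self._queue.get()
            await write(stream_id, chunk)

async def demo():
    sched = FrameScheduler()
    sent = []

    async def fake_write(stream_id, chunk):
        sent.append((stream_id, chunk))

    # Chunks are enqueued interleaved, but the high-priority stream
    # (priority 0) is flushed before the low-priority one (priority 1).
    sched.enqueue(1, stream_id=3, chunk=b"low-1")
    sched.enqueue(0, stream_id=1, chunk=b"high-1")
    sched.enqueue(1, stream_id=3, chunk=b"low-2")
    sched.enqueue(0, stream_id=1, chunk=b"high-2")
    await sched.drain(fake_write)
    return sent

print(asyncio.run(demo()))
# [(1, b'high-1'), (1, b'high-2'), (3, b'low-1'), (3, b'low-2')]
```

Note this prioritizes bytes on the wire, not CPU time: the asyncio task order never changes, only the order in which data is handed to the socket.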

user4815162342
  • But even if **await** for high priority tasks is called before low priority tasks (which is what I think you mean by "asyncio primitives being used in the correct order"), once all the tasks (high and low priority) are in the event loop, they will be scheduled by the scheduler in an arbitrary order each time the **asgi_app** yields control back to the scheduler. So the priority of all tasks essentially becomes equal, right? – Akshay Takkar May 20 '19 at 04:40
  • @AkshayTakkar No, because the last sentence refers to order of asyncio primitives which include synchronization, not to order of low-level kernel primitives within a single iteration pass. For example, if you put items in an asyncio queue "in correct order", this order *will* be respected, provided that the queue items are processed sequentially. Sequential order is ensured simply by using `await` in the right places, and that will ensure that some operations are not only **initiated**, but also **finished**, before others start. – user4815162342 May 20 '19 at 05:54
  • Ok, so the higher priority tasks will get a head start, but that still means that each task gets the same amount of CPU time regardless of its priority, right? What I want instead is for the high priority tasks to run longer or be scheduled more frequently, so that at any given time the high priority tasks in the queue are being given a larger share of CPU time. – Akshay Takkar May 20 '19 at 14:54
  • @AkshayTakkar There is no CPU in asyncio tasks, they only service (non-blocking) IO and the necessary bookkeeping between them. Any CPU work must be done in a different thread, and only awaited by asyncio. (Prioritizing tasks as you wish might be best accomplished using multiprocessing and process priorities.) And HTTP2 priorities, as I understand them, are still about IO, not about CPU. – user4815162342 May 20 '19 at 18:07
  • Say, for example, I have 5 tasks in my asyncio queue, and let's say they have all completed their business logic and are ready to start sending data to the client; then each task gets the same amount of time to send data before control is yielded back to the scheduler. So, e.g., task 1 has 200 bytes to send and is run by the scheduler. It sends 50 bytes, then yields control back to the scheduler. The scheduler then runs task 2, task 2 does the same, then task 1 is run again and sends another 50 bytes, and so on until all data is sent. Is that right? – Akshay Takkar May 21 '19 at 14:22
  • Yes. The described scenario assumes that all relevant tasks have been started and are runnable. If you are implementing a priority/dependency system, you will have tasks competing for the same TCP stream `await` on the condition of there being no higher-priority task wanting to use the same stream. Once the system is set up like that, it's not a problem that all runnable tasks run in an undefined order, because the runnable tasks will be those of the same priority (or of completely unrelated priorities) and, given that they only execute async operations, they won't block each other in any way. – user4815162342 May 21 '19 at 14:59
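The gating idea from the last comment could be sketched roughly like this (PriorityGate is a hypothetical helper, not from any real library, assuming lower numbers mean higher priority): each sender registers its priority, then waits on a condition until no higher-priority sender is pending before touching the shared stream.

```python
import asyncio

class PriorityGate:
    def __init__(self):
        self._pending = []               # priorities of senders wanting the stream
        self._cond = asyncio.Condition()

    async def send(self, priority, do_send):
        self._pending.append(priority)
        await asyncio.sleep(0)           # give competing senders a chance to register
        async with self._cond:
            # Block until this sender holds the best (lowest) pending priority.
            await self._cond.wait_for(lambda: min(self._pending) == priority)
            try:
                do_send()                # stands in for writing frames to the socket
            finally:
                self._pending.remove(priority)
                self._cond.notify_all()

async def demo():
    gate = PriorityGate()
    sent = []

    async def sender(prio, name):
        await gate.send(prio, lambda: sent.append(name))

    # Low-priority senders are created first, but once everyone has
    # registered, the high-priority sender gets the stream first.
    await asyncio.gather(sender(1, "low-a"), sender(0, "high"), sender(1, "low-b"))
    return sent

print(asyncio.run(demo()))  # "high" is sent first
```

As the comments note, this prioritizes access to the I/O stream, not CPU time; senders of equal priority still proceed in whatever order the loop resumes them.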