
I've got an external C++ library that has been wrapped with Cython. I cannot change the C++ library itself. I would like to use the library as part of a Python application that uses asyncio as its primary process control.

The Cython library essentially does network work with a proprietary protocol. However, it blocks once the library's event handler is started from Python. I've gotten it to a stage where I can pass in a Python function and receive callbacks for events received from the C++ library. I can stop the library from hanging the application at the event handler if I run the event handler within event_loop.run_in_executor.

My question is: how can I best model this so that it fits well with asyncio's interfaces, rather than hacking up ad hoc solutions around the Cython library methods? I looked into writing this as an asyncio.Protocol and asyncio.Transport that use the Cython library as their underlying communication mechanism. However, that looks like a lot of effort, with some monkey patching to make it look like a socket. Is there a better way or abstraction for wrapping external libraries so that they work with asyncio?

0x00
  • AFAIK the C++ library will need to support non-blocking calls for this to be an option. I.e., if you set its socket to be non-blocking with `setblocking(0)`, does it propagate EAGAIN errors to the client? If so, it can be used in an asyncio loop like any other non-blocking library. Otherwise, since it's native code that you cannot change, not likely. – danny Jul 20 '17 at 15:18
  • @danny I think you might be right. Though I have the immediate thread hanging issues resolved, the scheduling performance is very poor when using run_in_executor with a ThreadPoolExecutor. Setting debug for asyncio shows waits of multiple seconds for different jobs in the event loop. Unless I've got another bug somewhere, it's likely caused by the thread processing the blocking code and just sitting there for a period of time before another asyncio task can finally be scheduled. – 0x00 Jul 21 '17 at 12:42
  • That said, I am aware of one third-party module that claims to be able to patch native code extensions and make them asynchronous - [greenify](https://github.com/douban/greenify). It will only work with gevent though, as it uses hooks specifically for it. Give it a go, I would be interested to know if it helps. I can post an example if the shared library is publicly available. – danny Jul 21 '17 at 12:57
  • @danny I appreciate the offer, but I need to use asyncio for this one. The approach they take is quite interesting, however. Thanks for that! I'll look into it a bit more for another task I've got. – 0x00 Jul 25 '17 at 09:52

1 Answer


To answer my own question: as far as I can see, there is no obligation to use the Protocol or Transport abstractions provided by asyncio to structure an application. The best model I found is a regular class with its methods defined as async. The class can then be made to follow whatever pattern fits your requirements. This is especially relevant if the code you are wrapping doesn't have the same overall use case as a socket. The abstractions asyncio provides are pretty bare-bones.

For complicated things like Cython-wrapped blocking C++ code, you will need to use multiprocessing to avoid hanging the interpreter. Asyncio does not make it possible to run blocking code without changes; the code must be written specifically to be asyncio compatible.
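As a rough illustration of the "regular class with async methods" idea, here is a minimal sketch. `BlockingClient` is a hypothetical stand-in for the Cython wrapper (its `send` method and the upper-casing are invented for the example), and a single-worker thread pool is assumed. A thread only helps if the wrapped call releases the GIL; if it does not, a separate process is the safer choice.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


class BlockingClient:
    """Hypothetical stand-in for the Cython-wrapped blocking library."""

    def send(self, payload: bytes) -> bytes:
        time.sleep(0.01)  # simulates a blocking network round trip
        return payload.upper()


class AsyncClient:
    """Plain class exposing async methods; no Protocol or Transport involved."""

    def __init__(self, client: BlockingClient) -> None:
        self._client = client
        # One worker keeps calls into the wrapped library serialized.
        self._executor = ThreadPoolExecutor(max_workers=1)

    async def send(self, payload: bytes) -> bytes:
        loop = asyncio.get_running_loop()
        # The blocking call runs in the worker thread; the event loop stays free.
        return await loop.run_in_executor(
            self._executor, self._client.send, payload
        )

    def close(self) -> None:
        self._executor.shutdown(wait=True)


async def main() -> bytes:
    client = AsyncClient(BlockingClient())
    try:
        return await client.send(b"ping")
    finally:
        client.close()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Callers only ever see `await client.send(...)`, so the executor can later be swapped for a process-based backend without changing the class's interface.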

What I did was put the entire blocking code, including the construction of the object, into a function that was executed with event_loop.run_in_executor. In addition, I used a Unix socket to communicate with the process for commands and callback data. Because the communication goes over Unix sockets, you can use asyncio methods in your main application; the same goes for pipes.
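A minimal sketch of that arrangement, under stated assumptions: the blocking code is started here with `multiprocessing.Process` using the Unix-only `fork` start method, the two sides talk over a `socket.socketpair()`, and `blocking_worker` with its newline-delimited "event" messages is a hypothetical placeholder for the wrapped library. It is not the exact code I used.

```python
import asyncio
import multiprocessing
import socket
import time


def blocking_worker(sock: socket.socket, count: int) -> None:
    """Runs in the child process; stands in for the blocking library."""
    with sock:
        for i in range(count):
            time.sleep(0.01)  # simulates blocking inside the event handler
            sock.sendall(f"event {i}\n".encode())


async def read_events(sock: socket.socket) -> list[str]:
    # Wrap the parent's end of the socket pair in asyncio streams.
    reader, writer = await asyncio.open_connection(sock=sock)
    events = []
    while line := await reader.readline():  # b"" at EOF ends the loop
        events.append(line.decode().rstrip())
    writer.close()
    await writer.wait_closed()
    return events


def main(count: int = 3) -> list[str]:
    parent_sock, child_sock = socket.socketpair()  # AF_UNIX on Linux/macOS
    # Assumption: "fork" start method (Unix only); the child inherits child_sock.
    ctx = multiprocessing.get_context("fork")
    proc = ctx.Process(target=blocking_worker, args=(child_sock, count))
    proc.start()
    child_sock.close()  # the parent keeps only its own end
    try:
        return asyncio.run(read_events(parent_sock))
    finally:
        proc.join()


if __name__ == "__main__":
    print(main())
```

Commands can travel in the other direction over the same socket through `writer.write(...)`, since a socket pair is bidirectional.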

Here are some results I got from sending 128 bytes from the multiprocessing Process producer to the asyncio main process. The data was generated at a 10-millisecond interval. The duration was timed using time.perf_counter(). The results below are in nanoseconds. The machine was an Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz running Linux kernel 4.10.17.
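For anyone wanting to run a similar measurement, here is one way it might be set up. This is a sketch, not the benchmark code that produced the numbers below. It assumes the `fork` start method, a socket pair as the channel, `time.perf_counter_ns()` instead of `time.perf_counter()`, and a clock that is comparable across processes (true for the monotonic clock on Linux, but an assumption elsewhere). The names `producer`, `consume`, and `measure` are made up for the example.

```python
import asyncio
import multiprocessing
import socket
import struct
import time

MESSAGE_SIZE = 128  # bytes per message, matching the description above


def producer(sock: socket.socket, count: int, interval: float) -> None:
    """Child process: send fixed-size messages stamped with the send time."""
    with sock:
        for _ in range(count):
            stamp = struct.pack("!q", time.perf_counter_ns())
            sock.sendall(stamp.ljust(MESSAGE_SIZE, b"\0"))  # pad to 128 bytes
            time.sleep(interval)


async def consume(sock: socket.socket, count: int) -> list[int]:
    reader, writer = await asyncio.open_connection(sock=sock)
    latencies = []
    for _ in range(count):
        message = await reader.readexactly(MESSAGE_SIZE)
        (sent_ns,) = struct.unpack("!q", message[:8])
        # Assumes both processes read the same system-wide monotonic clock.
        latencies.append(time.perf_counter_ns() - sent_ns)
    writer.close()
    await writer.wait_closed()
    return latencies


def measure(count: int = 20, interval: float = 0.01) -> list[int]:
    parent_sock, child_sock = socket.socketpair()
    ctx = multiprocessing.get_context("fork")  # Unix only
    proc = ctx.Process(target=producer, args=(child_sock, count, interval))
    proc.start()
    child_sock.close()
    try:
        return asyncio.run(consume(parent_sock, count))
    finally:
        proc.join()


if __name__ == "__main__":
    samples = measure()
    print(f"{len(samples)} samples, min {min(samples)} ns, max {max(samples)} ns")
```

The list returned by `measure()` holds per-message latencies in nanoseconds and can be summarized with whatever statistics tool is at hand.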

Asyncio with uvloop

count   10001.000000
mean    76435.956504
std      8887.459462
min     63608.000000
25%     71709.000000
50%     74104.000000
75%     79496.000000
max    287204.000000

Standard Asyncio event loop

count   10001.000000
mean   199741.937506
std     27900.377114
min    173321.000000
25%    185545.000000
50%    191839.000000
75%    205279.000000
max    529246.000000
0x00