28

I'm developing a real-time REST API using Python 3 + Bottle/uWSGI. I was experiencing latencies in my code, sometimes hundreds of milliseconds, which really matters in my app.

Using the logging module, I tried to identify the slow parts of my code by printing how long individual code blocks took to run. I know this is an extremely poor way of profiling code, but sometimes it does the job quite well.

Even though I identified some slow parts, I was still missing something -- the individual parts seemed to take tens of milliseconds, but as a whole they very often took hundreds. After some increasingly insane experiments which drove me almost completely nuts, I came to the following:

import logging
import time

t = round(100 * time.time())
logging.info('[%s] Foo' % t)
logging.info('[%s] Bar' % t)

Surprisingly, it gives:

2014-07-16 23:21:23,531  [140554568353] Foo
2014-07-16 23:21:24,312  [140554568353] Bar

Even though this seems hard to believe, there are two consecutive logging.info() calls, and for some reason there is a gap of almost 800 ms between them. Can anybody tell me what is going on? It is noteworthy that if there are multiple info() calls, the latency appears only once per API method, most frequently at its very beginning (after the first call).

My only hypothesis is disk latency: there are several (but not that many!) workers running simultaneously, and I'm using a rotational disk, no SSD. But I thought there are buffers and that the OS would do the disk flush asynchronously for me. Am I wrong in my assumptions? Should I avoid logging completely to avoid the latencies?

EDIT

Based on Vinay Sajip's suggestion, I switched to the following initialization code:

import logging
import logging.handlers
import queue

log_que = queue.Queue(-1)
queue_handler = logging.handlers.QueueHandler(log_que)
log_handler = logging.StreamHandler()
queue_listener = logging.handlers.QueueListener(log_que, log_handler)
queue_listener.start()
logging.basicConfig(level=logging.DEBUG, format="%(asctime)s  %(message)s", handlers=[queue_handler])

It seems to work fine (if I comment out queue_listener.start(), there is no output), but the very same latencies still appear. I don't see how that is even possible; the call should be non-blocking. I also put gc.collect() at the end of each request to make sure the problem is not caused by the garbage collector -- with no effect. I also tried turning logging off for a whole day; the latencies disappeared, so I think their source must be in the logging module...

Tregoreg
  • I would suspect a system problem and not a code problem. For instance, if you're on Ubuntu and have not defined a swap file, or if you're on Windows and you have less than 20 GB on your hard drive, you'll experience loss of responsiveness – Jonathan Livni Jul 16 '14 at 21:53
  • If you can reduce it to a small self-contained script which shows the problem, please post that. Otherwise, it can be hard to see what's going on in a multi-threaded environment. Can you be sure that there are no synchronous handlers defined *anywhere* in your system (e.g. in third-party libraries)? That could affect the latencies. Libraries are not supposed to add handlers other than `NullHandler`, but developers don't always do things as they should. – Vinay Sajip Jul 19 '14 at 12:13
  • Also note, the `format=` parameter to `basicConfig()` only applies to handlers added in that call, i.e. to the `QueueHandler` in this case. That does not use a formatter, and the `StreamHandler` will not have a formatter set as it's only passed to the `QueueListener`. You would need to set a formatter on that explicitly. – Vinay Sajip Jul 19 '14 at 12:19
  • @VinaySajip You are absolutely right, it is caused by the uWSGI "magical" forking of processes. When I printed the handlers currently in use, `QueueHandler` was gone and the default `StreamHandler` was present again. I'm now trying to use lazy initialization to create the handler AFTER the process is forked. – Tregoreg Jul 22 '14 at 17:56
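
A minimal sketch of that lazy, post-fork initialization, assuming uWSGI's Python plugin (which provides the uwsgidecorators module); the function name and format string are only illustrative:

import atexit
import logging
import logging.handlers
import queue

from uwsgidecorators import postfork

@postfork
def setup_queue_logging():
    # Runs in every worker after uWSGI has forked it, so the configuration
    # cannot be lost to the pre-fork master process.
    log_que = queue.Queue(-1)
    queue_handler = logging.handlers.QueueHandler(log_que)

    log_handler = logging.StreamHandler()
    log_handler.setFormatter(logging.Formatter("%(asctime)s  %(message)s"))

    queue_listener = logging.handlers.QueueListener(log_que, log_handler)
    queue_listener.start()
    atexit.register(queue_listener.stop)

    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.handlers = [queue_handler]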

4 Answers

25

You can use asynchronous handlers (QueueHandler and the corresponding QueueListener, added in Python 3.2 and described in this post) and do the I/O processing of your logging events in a separate thread or process.
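
For the "separate process" variant, a minimal sketch under plain multiprocessing (the file name and function names are only illustrative): the workers merely put records on a shared queue, and a dedicated listener process does the actual I/O.

import logging
import logging.handlers
import multiprocessing

def listener_process(log_que):
    # The only process that touches the (potentially slow) output handler.
    handler = logging.FileHandler("app.log")
    handler.setFormatter(logging.Formatter("%(asctime)s  %(message)s"))
    while True:
        record = log_que.get()
        if record is None:          # sentinel: shut down
            break
        handler.handle(record)

def configure_worker(log_que):
    # In each worker the root logger only enqueues records; it never blocks on I/O.
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.handlers = [logging.handlers.QueueHandler(log_que)]

if __name__ == "__main__":
    log_que = multiprocessing.Queue(-1)
    listener = multiprocessing.Process(target=listener_process, args=(log_que,))
    listener.start()

    configure_worker(log_que)
    logging.info("Foo")             # returns as soon as the record is queued
    logging.info("Bar")

    log_que.put_nowait(None)        # tell the listener to finish
    listener.join()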

Vinay Sajip
  • Seems like it exactly solved my problem! I've just put it into production and the logging latencies seem to be gone. Moreover, I had to do only a little programming to fix it. Thanks a lot! – Tregoreg Jul 18 '14 at 19:20
  • Oh no, it was too soon to exult; the very same latencies reappeared, so my problem remains unsolved :( – Tregoreg Jul 18 '14 at 20:23
  • Finally, I've got everything implemented correctly, including lazy initialization of the logger AFTER the process is forked, and the latencies are really gone. Your answer was right from the beginning; it was just the messing around with uWSGI that caused all the complications. – Tregoreg Jul 22 '14 at 22:34
17

Try async logging provided by Logbook

As hasan has proposed, an async log handler can be the way to go.

I have recently tried Logbook and can say it will provide you with everything you need for this: ZeroMQHandler as well as ZeroMQSubscriber.
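
A rough sketch of that setup, assuming Logbook and pyzmq are installed (the address and file name are arbitrary examples). In the API process the handler only pushes each record to a ZeroMQ socket:

import logbook
from logbook.queues import ZeroMQHandler

ZeroMQHandler('tcp://127.0.0.1:5053').push_application()

log = logbook.Logger('api')
log.info('Foo')   # returns once the record is on the socket, not on disk

A separate subscriber process collects the records and does the actual (slow) writing; with several worker processes you may need the handler's multi-publisher mode, so check the Logbook documentation:

from logbook import FileHandler
from logbook.queues import ZeroMQSubscriber

subscriber = ZeroMQSubscriber('tcp://127.0.0.1:5053')
with FileHandler('api.log').applicationbound():
    subscriber.dispatch_forever()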

Paweł Polewicz
Jan Vlcinsky
  • Honestly, 4 years later, we're running on Logbook everywhere based on your original advice; I just didn't manage to write it in here :) It works great and the issue never appeared anymore, so your answer was right! – Tregoreg Jun 29 '18 at 21:36
  • How does Logbook do async logging? I cannot find any information related to that in their docs – Muhammad Sarim Mehdi Feb 15 '22 at 08:59
  • @SarimMehdi Logbook's ZeroMQHandler sends the log record to a ZeroMQ socket and proceeds; the actual writing of the log happens (if ever) asynchronously, provided you use ZeroMQSubscriber, which collects the records and does something with them. Anyway, this kind of async is by no means asyncio. – Jan Vlcinsky Feb 16 '22 at 14:43
7

First, start with an eviction queue (circular buffer)... this ensures the queue handler cannot consume all available RAM.

import queue
from queue import Queue


class EvictQueue(Queue):
    def __init__(self, maxsize):
        self.discarded = 0
        super().__init__(maxsize)

    def put(self, item, block=False, timeout=None):
        while True:
            try:
                super().put(item, block=False)
                break  # the record made it onto the queue
            except queue.Full:
                try:
                    self.get_nowait()    # queue full: drop the oldest record
                    self.discarded += 1
                except queue.Empty:
                    pass

Then replace all of the handlers on your root logger... whatever they are after normal configuration...

import logging
import logging.handlers


def speed_up_logs():
    rootLogger = logging.getLogger()
    log_que = EvictQueue(1000)
    queue_handler = logging.handlers.QueueHandler(log_que)
    queue_listener = logging.handlers.QueueListener(log_que, *rootLogger.handlers)
    queue_listener.start()
    rootLogger.handlers = [queue_handler]

Effects:

  • Logging will be very fast

  • If you log faster than you can write to the drive, older unwritten entries will be silently discarded.

  • Might be nice to enhance this by logging a single entry with the number of discarded entries every minute or so (swapping discarded back to zero); see the sketch below.
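
One possible way to implement that enhancement, building on the EvictQueue above (the interval, level and message are only illustrative):

import logging
import queue
import time
from queue import Queue

class ReportingEvictQueue(Queue):
    def __init__(self, maxsize, report_every=60):
        self.discarded = 0
        self.report_every = report_every
        self._last_report = time.monotonic()
        super().__init__(maxsize)

    def put(self, item, block=False, timeout=None):
        while True:
            try:
                super().put(item, block=False)
                break
            except queue.Full:
                try:
                    self.get_nowait()       # drop the oldest record
                    self.discarded += 1
                except queue.Empty:
                    pass
        self._maybe_report()

    def _maybe_report(self):
        now = time.monotonic()
        if self.discarded and now - self._last_report >= self.report_every:
            count, self.discarded = self.discarded, 0   # swap back to zero
            self._last_report = now
            # Enqueue a ready-made record directly; calling logging here
            # would just recurse back into this put().
            record = logging.LogRecord(
                "logging", logging.WARNING, __file__, 0,
                "dropped %d log records in the last %d s",
                (count, self.report_every), None)
            try:
                super().put(record, block=False)
            except queue.Full:
                pass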

Erik Aronesty
  • There's a `break` command missing after `super().put(item, block=False)` – Philippe Remy Jul 17 '21 at 13:46
  • Hey Erik, good work and well done! 6 months ago, @PhilippeRemy highlighted a very important caveat in your work, yet you turned a blind eye to it. Can we know why, please? – Yahya Dec 01 '21 at 09:59
4

It might depend on the logging handler. My experience is that, e.g., PostgreSQL used as a logging handler is a very poor choice for speed. FileHandler might give you very good results, but if your system is very I/O heavy then even simple file writes can be slow. I suggest using some asynchronous handler, e.g. sending the logs via UDP to a dedicated process.
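
A rough sketch of the UDP variant using only the standard library (host, port and file name are arbitrary examples). On the API side, DatagramHandler fires one datagram per record and returns immediately:

import logging
import logging.handlers

root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(logging.handlers.DatagramHandler("127.0.0.1", 9999))

logging.info("Foo")   # returns right after the UDP packet is sent

The dedicated process on the other end unpickles the records and writes them locally; DatagramHandler prefixes each pickled record dict with a 4-byte length, and unpickling network data is only safe on a trusted host:

import logging
import pickle
import socket

logging.basicConfig(filename="api.log", level=logging.DEBUG,
                    format="%(asctime)s  %(message)s")

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 9999))
while True:
    data, _ = sock.recvfrom(65536)
    record = logging.makeLogRecord(pickle.loads(data[4:]))   # skip the length prefix
    logging.getLogger(record.name).handle(record)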

hasan