13

I'm building an application using gevent. It is getting rather big, with a lot of jobs being spawned and destroyed. I've noticed that when one of these jobs crashes in a non-main greenlet, the rest of the application just keeps running, which is fine. The problem is that I have to look at my console to see the error. So some part of my application can "die" without me being instantly aware of it, while the app keeps running.

Littering my app with try/except blocks does not seem to be a clean solution. Maybe a custom spawn function that does some error reporting?

What is the proper way to monitor gevent jobs/greenlets and catch their exceptions?

In my case I listen for events from a few different sources, and I have to deal with each one differently. About 5 jobs are extremely important: the webserver greenlet, websocket greenlet, database greenlet, alarms greenlet, and zmq greenlet. If any of those dies, my application should die completely. Other jobs that die are not that important. For example, the websocket greenlet can die because of some exception while the rest of the application keeps running as if nothing happened. At that point the application is completely useless and dangerous, and it should just crash hard.

ritesh
Stephan
  • I've made the mission-critical greenlets very, very small (8 lines of code). They in turn spawn greenlets for which it's 'ok' to crash. – Stephan Aug 08 '13 at 15:05

4 Answers

12

I think the cleanest way would be to catch the exception you consider fatal and do sys.exit() (you'll need gevent 1.0 since before that SystemExit did not exit the process).

Another way is to use link_exception, which registers a callback that is called if the greenlet dies with an exception.

spawn(important_greenlet).link_exception(lambda *args: sys.exit("important_greenlet died"))

Note that you also need gevent 1.0 for this to work.

If you are on 0.13.6, do something like this to kill the process:

gevent.get_hub().parent.throw(SystemExit())
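
As a rough sketch of the "custom spawn function" idea from the question, the link_exception call can be wrapped in a small helper. The names spawn_critical, die and on_death are hypothetical, not part of gevent, and the default handler relies on gevent 1.0 or later as noted above:

```python
import sys

import gevent


def die(greenlet):
    # Hypothetical default handler: exit the process when a critical greenlet dies.
    # Assumes gevent >= 1.0, where SystemExit raised here ends the process.
    sys.exit("critical greenlet died: %r" % (greenlet.exception,))


def spawn_critical(func, *args, on_death=die, **kwargs):
    # Hypothetical helper: spawn func and attach a callback that runs
    # only if the greenlet finishes with an exception.
    greenlet = gevent.spawn(func, *args, **kwargs)
    greenlet.link_exception(on_death)
    return greenlet
```

Keeping the handler as a parameter means less important jobs can reuse the same helper with a handler that only reports the failure.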
Denis
3

You want to greenlet.link_exception() all of your greenlets to a janitor function.

The janitor function will be passed any greenlet that dies, from which it can inspect its greenlet.exception to see what happened, and if necessary do something about it.
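
A minimal sketch of such a janitor, assuming it only records what failed. The names janitor, failures, flaky, healthy and run_demo are made up for illustration:

```python
import gevent

failures = []  # hypothetical in-memory record of dead greenlets


def janitor(greenlet):
    # Called by gevent with the greenlet that died with an exception.
    failures.append((greenlet, greenlet.exception))


def flaky():
    raise ValueError("boom")


def healthy():
    return "ok"


def run_demo():
    failures.clear()
    jobs = [gevent.spawn(flaky), gevent.spawn(healthy)]
    for job in jobs:
        job.link_exception(janitor)
    gevent.joinall(jobs)
    gevent.sleep(0)  # give any pending link callbacks a chance to run
    return jobs
```

In a real application the janitor would decide, per greenlet, whether to log, restart the job, or shut the process down.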

Ivo
2

As @Denis and @Ivo said, link_exception is OK, but I think there is a better way that does not require changing the code you currently use to spawn greenlets.

Generally, whenever an exception is raised in a greenlet, the _report_error method (in gevent.greenlet.Greenlet) is called for that greenlet. It does some work such as calling all the link functions and finally calls self.parent.handle_error with the exc_info from the current stack. The self.parent here is the global Hub object, which means that the exceptions from every greenlet are always funneled into one method for handling. By default, Hub.handle_error looks at the exception type, ignores some types and prints the others (which is what we always see in the console).

By patching Hub.handle_error method, we can easily register our own error handlers and never lose an error anymore. I wrote a helper function to make it happen:

from gevent.hub import Hub


IGNORE_ERROR = Hub.SYSTEM_ERROR + Hub.NOT_ERROR


def register_error_handler(error_handler):

    Hub._origin_handle_error = Hub.handle_error

    def custom_handle_error(self, context, type, value, tb):
        if not issubclass(type, IGNORE_ERROR):
            # print 'Got error from greenlet:', context, type, value, tb
            error_handler(context, (type, value, tb))

        self._origin_handle_error(context, type, value, tb)

    Hub.handle_error = custom_handle_error

To use it, just call it before the event loop is initialized:

def gevent_error_handler(context, exc_info):
    """Here goes your custom error handling logics"""
    e = exc_info[1]
    if isinstance(e, SomeError):
        # do some notify things
        pass
    sentry_client.captureException(exc_info=exc_info)

register_error_handler(gevent_error_handler)

This solution has been tested under gevent 1.0.2 and 1.1b3. We use it to send greenlet error information to Sentry (an exception tracking system), and it has worked pretty well so far.

Reorx
0

The main issue with greenlet.link_exception() is that it does not give any information about the traceback, which can be really important to log.

To log with the traceback, I use a decorator to spawn jobs. It routes the job call through a simple logging function:

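import logging
from functools import wraps

import gevent

log = logging.getLogger(__name__)


# Named run_async because `async` is a reserved keyword since Python 3.7.
def run_async(wrapped):
    """Decorator: calling the function spawns it in a greenlet that logs failures."""

    def log_exc(func):

        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # log.exception records the message together with the traceback.
                log.exception('%s', func)
        return wrapper

    @wraps(wrapped)
    def wrapper(*args, **kwargs):
        # Return the greenlet so callers can join or link it.
        return gevent.spawn(log_exc(wrapped), *args, **kwargs)

    return wrapper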

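Of course, you can add the link_exception call to manage jobs (which I did not need). Note that the logging wrapper would then have to re-raise after logging, since an exception that is swallowed never triggers the link.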
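
For comparison, here is a sketch of logging the traceback from a link_exception callback instead. It assumes gevent 1.1 or later, where a greenlet that died exposes an exc_info tuple. The names log_failure, broken_job, run_demo and the logger name are hypothetical:

```python
import logging

import gevent

log = logging.getLogger("greenlet-monitor")  # hypothetical logger name


def log_failure(greenlet):
    # Assumption: gevent >= 1.1, where Greenlet.exc_info holds
    # the (type, value, traceback) tuple of the failure.
    log.error("greenlet %r died", greenlet, exc_info=greenlet.exc_info)


def broken_job():
    raise ValueError("boom")


def run_demo():
    job = gevent.spawn(broken_job)
    job.link_exception(log_failure)
    job.join()
    gevent.sleep(0)  # give any pending link callbacks a chance to run
    return job
```

With this variant the job itself stays unwrapped, so other link callbacks still see the failure.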

Hadrien