Detect stopped server process via rpyc.Connection

Question

Suppose I have a Service:

import rpyc

class MyService(rpyc.Service):
    my_dict = {}

    def exposed_put(self, key, val):
        MyService.my_dict[key] = val

    def exposed_get(self, key):
        return MyService.my_dict[key]

    def exposed_delete(self, key):
        del MyService.my_dict[key]

Now I start that Service running in a ThreadedServer:

from rpyc.utils.server import ThreadedServer
server = ThreadedServer(MyService, port=8000)
server.start()

Now in a different process on the same machine, I open a new Connection to the server:

import rpyc
c = rpyc.connect('localhost', 8000)

... but before accessing the root of the connection, the server process stops for some reason, such as Ctrl-Z in the controlling terminal of the server process. Now when I try to access the root via:

c.root

... Python hangs. Ctrl-C on the client side shows this:

In [31]: c.root
^C---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-31-856a441cc51a> in <module>()
----> 1 c.root

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/protocol.pyc in root(self)
    465         """Fetches the root object (service) of the other party"""
    466         if self._remote_root is None:
--> 467             self._remote_root = self.sync_request(consts.HANDLE_GETROOT)
    468         return self._remote_root
    469

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/protocol.pyc in sync_request(self, handler, *args)
    436         seq = self._send_request(handler, args)
    437         while seq not in self._sync_replies:
--> 438             self.serve(0.1)
    439         isexc, obj = self._sync_replies.pop(seq)
    440         if isexc:

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/protocol.pyc in serve(self, timeout)
    385                   otherwise.
    386         """
--> 387         data = self._recv(timeout, wait_for_lock = True)
    388         if not data:
    389             return False

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/protocol.pyc in _recv(self, timeout, wait_for_lock)
    342             return None
    343         try:
--> 344             if self._channel.poll(timeout):
    345                 data = self._channel.recv()
    346             else:

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/channel.pyc in poll(self, timeout)
     41     def poll(self, timeout):
     42         """polls the underlying steam for data, waiting up to *timeout* seconds"""
---> 43         return self.stream.poll(timeout)
     44     def recv(self):
     45         """Receives the next packet (or *frame*) from the underlying stream.

/home/mack/anaconda/lib/python2.7/site-packages/rpyc/core/stream.pyc in poll(self, timeout)
     39             while True:
     40                 try:
---> 41                     rl, _, _ = select([self], [], [], timeout)
     42                 except select_error as ex:
     43                     if ex[0] == errno.EINTR:

KeyboardInterrupt:

So it appears a call to Stream.poll ends up in an infinite loop if the server process is stopped but still connected (underlying socket is still open). Am I correct in thinking this is an unexpected case in the Stream implementation? I am using version 3.3.0. How might I detect this case and avoid the client hanging?

Steve Barnes · Answer 1 · 2015-12-20T22:39:55.830

2

If there is a risk of the server being halted, you can just check the value of c.closed. You could also provide a callback function on the client side to notify you that the connection has closed: pass it to your initialiser, possibly under the name on_exit, and register it with atexit.
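
A minimal sketch of the c.closed check, written against any connection-like object so it runs without a live server. The names safe_root and FakeConnection are hypothetical; the only attributes assumed on the connection are closed and root. As the comments on this answer point out, this only catches a terminated server, where rpyc raises EOFError. A stopped server keeps the socket open, so closed would be expected to stay false.

```python
class FakeConnection(object):
    """Hypothetical stand-in exposing only the attributes used below."""

    def __init__(self, closed=False, fail=False):
        self.closed = closed
        self._fail = fail

    @property
    def root(self):
        # Mimic a dropped connection: rpyc surfaces this as EOFError.
        if self._fail:
            raise EOFError("connection closed by peer")
        return "root-object"


def safe_root(conn):
    """Return conn.root, or None if the connection is closed or drops."""
    if conn.closed:
        return None
    try:
        return conn.root
    except EOFError:
        return None
```

With a real connection this would be called as safe_root(c) in place of a bare c.root.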

To deal with the remote possibility that the server is paused or, less remotely, busy, you would have to implement a heartbeat, i.e. a periodic callback that informs the client that the server is available.
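
One way the client-side half of such a heartbeat might look, as a sketch only. HeartbeatMonitor, beat, server_alive and max_silence are hypothetical names, not part of rpyc. How the server delivers the periodic call (for example, a callback passed over the connection and invoked from a server-side thread) is assumed and not shown.

```python
import time


class HeartbeatMonitor(object):
    """Tracks when the server last reported in (hypothetical helper)."""

    def __init__(self, max_silence, clock=time.time):
        # max_silence: seconds without a beat before the server is
        # considered unavailable. clock is injectable for testing.
        self.max_silence = max_silence
        self._clock = clock
        self._last_beat = clock()

    def beat(self):
        """Record a heartbeat; meant to be invoked by the server's callback."""
        self._last_beat = self._clock()

    def server_alive(self):
        """True if a beat arrived within the last max_silence seconds."""
        return (self._clock() - self._last_beat) <= self.max_silence
```

The client would check server_alive() before each remote call and treat a False result as a stopped or busy server instead of blocking on the request.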

edited Dec 20 '15 at 22:39

answered Dec 20 '15 at 06:56

Steve Barnes


You're confusing the effects of stopped and terminated. If the server process terminates, it is actually much easier to detect. The socket will be disconnected on termination and raise a `socket.error`. This will be caught by `rpyc` and re-raised as a `EOFError`. Afterwards, any attempt to access the root will immediately raise an `EOFError`. – Mack Dec 20 '15 at 13:54
@Mack - So you are looking to detect when the server is __paused__ rather than terminated! – Steve Barnes Dec 20 '15 at 14:46
1

Paused, stopped, hanging, etc. Yep! I ran across this problem while trying to account for every type of fault I could think of. I was building a replicated, distributed, persistent hash map. I came to the conclusion that it would require a change to the source (or some monkey-patching as a temp fix). I figured I'd ask to make sure I wasn't missing something though. – Mack Dec 20 '15 at 15:47
I feel the behavior of the connection in the case of a terminated server is giving me a hard time implementing a working solution. At least in my environment, the termination of the server does not raise an 'on_disconnect' event in the client but does raise a timeout exception when the client requests a server task. In the client, for each call I first have to check for an existing connection and then also wrap every call to the server in a try/except clause. To prevent the client from hanging until the server timeout, I also had to implement a heartbeat mechanism. Why is this not in the library? – juerg Mar 26 '19 at 08:00
@juerg For the simple reason that nobody has added it to the library - I would suggest raising a ticket to suggest it as an enhancement here (https://github.com/tomerfiliba/rpyc) and ideally implementing it and adding a pull request yourself if you feel up to it. – Steve Barnes Mar 28 '19 at 05:38
