I'm working with a cluster of CentOS 6.5 servers: one head node and the rest slave nodes. The nodes are connected through a switch on a local 192.168.1.x network that's not visible to the outside world.

I'm trying to use the Python dispy package on the head node to distribute a Python script to the slave nodes. The slave nodes are all running dispynode.py, and when I start the dispy program on the head node (the client), the slaves report "ignoring ping from 192.168.1.1". The job then just hangs. Any ideas why the slaves running dispynode.py are ignoring the ping and not running the job?

Thanks!

Rich
Doug
  • By the way, I wasn't allowed to create a dispy tag; it seems like that might be helpful. – Doug Jan 07 '15 at 19:08
  • Any chance the dispy versions are different between the client and slaves? – Rich Jan 07 '15 at 19:15
  • Hey @Rich that's a good call. I just dug through dispynode.py source and saw that there is a try/except block where if the versions don't match it tosses the "ignore ping" message. Wouldn't it be helpful if the error message mentioned why it was being raised, lol. Oh well, Thanks! – Doug Jan 07 '15 at 19:22

1 Answer

The answer is as @Rich mentioned above: the dispy versions must match. dispy does not give a very helpful error message when the client and the nodes run different versions, but they must have the same version number in order to communicate. I found this in the source code for dispynode.py:

try:
    info = unserialize(msg[len('PING:'):])
    assert info['version'] == _dispy_version
    if info['ip_addr'] is None:
        addr = (addr[0], info['port'])
    else:
        addr = (info['ip_addr'], info['port'])
except:
    logger.debug('Ignoring ping message from %s (%s)', addr[0], addr[1])
    continue

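Note the assert on the version. If the version in the client's ping differs from the node's _dispy_version, the assertion fails, the bare except catches it, and the node only logs the "Ignoring ping" message without saying why.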
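
To show how the reason could be surfaced, here is a minimal sketch of a ping check that reports why a ping is rejected. This is not dispy's code: check_ping, handle_ping, and LOCAL_VERSION are hypothetical names, the version strings and port are arbitrary example values, and the payload is assumed to be a dict with the same 'version', 'ip_addr', and 'port' keys as in the excerpt above.

```python
import logging

logger = logging.getLogger("ping_sketch")

# Hypothetical stand-in for dispynode's _dispy_version; the value is arbitrary.
LOCAL_VERSION = "4.1"


def check_ping(info, local_version=LOCAL_VERSION):
    """Return None if the ping looks acceptable, else a string saying why not."""
    if not isinstance(info, dict):
        return "malformed ping payload"
    remote_version = info.get("version")
    if remote_version != local_version:
        return "version mismatch: client has %r, node has %r" % (
            remote_version, local_version)
    if "port" not in info:
        return "ping payload has no port"
    return None


def handle_ping(info, addr):
    """Return the address to reply to, or None if the ping is ignored."""
    reason = check_ping(info)
    if reason is not None:
        # Same message as in the excerpt, plus the reason for ignoring it.
        logger.warning("Ignoring ping message from %s (%s): %s",
                       addr[0], addr[1], reason)
        return None
    # Same address selection as in the excerpt.
    if info.get("ip_addr") is None:
        return (addr[0], info["port"])
    return (info["ip_addr"], info["port"])


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # A client with a different (made-up) version is rejected with a reason.
    handle_ping({"version": "3.9", "ip_addr": None, "port": 51347},
                ("192.168.1.1", 51347))
```

With explicit checks instead of an assert inside a bare except, a mismatch like mine would have shown up in the log as a version mismatch rather than just an ignored ping.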

Doug