I have two computers, both with the pathos Python module installed. I have a pathos multiprocessing pool and have been trying to get pathos to split its processes evenly between the two machines using the following code:

import time
from pathos.multiprocessing import ProcessPool
ngramPool = ProcessPool()
ngramPool.ncpus = 8
ngramPool.servers = ('localhost:5653', 'ec2-18-223-23-82.us-east-2.compute.amazonaws.com:5653')
questionNgrams = []
i = 0
previousI = 0
previousTime = time.time()
# Test questions
# questions = ["To whom do I owe this great pleasure","Who do I owe this great pleasure which is a great pleasure to","Who do I owe this great pleasure to"]
questionNgrams = ngramPool.map(n_gram.stringToNgrams, questions)

However, instead of running 4 processes on my local CPU and 4 on the Amazon EC2 instance, all 8 processes are being run on my local processor. How do I set up pathos so that it spawns 4 processes on my CPU and another 4 on the Amazon instance?

Montana Burr
  • It appears that pathos defaults to using an RPC connection, which Amazon AWS does not currently have. So, I might need to figure out how to get it to use SSH instead. – Montana Burr Aug 25 '18 at 19:20
  • One of the most basic features of pathos is the ability to configure and launch a RPC-based service on a remote host. – Montana Burr Aug 25 '18 at 19:21
  • So, first I need to get ppserver running on the server. That seems to be as simple as running ppserver. – Montana Burr Aug 25 '18 at 19:41

1 Answer

I'm the pathos author. Working with distributed resources isn't as straightforward as you might want. You are correct (in your comments) that pathos uses RPC-based connections (wrapped in SSH). You are also correct that you have to set up a ppserver on the remote host. If you need to make an SSH connection, you can do that with the pathos_connect script (see the associated documentation), or directly with code in the pathos.secure module. Note that you'll also need a working ssh-agent and SSH key-pair authentication set up (i.e. no passphrase is required after the initial connection).
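As a rough sketch of the setup described above, the two commands involved can be assembled like this. It assumes the `ppserver` script that ships with `ppft` accepts `-p` (port) and `-w` (number of workers), and it uses a plain OpenSSH local port forward rather than `pathos_connect`. The host name, ports, and helper function names are placeholders for illustration, not part of pathos.

```python
import shlex


def ppserver_command(port, workers):
    """Command to run ON THE REMOTE HOST to start a ppserver.

    Assumption: the `ppserver` script installed alongside ppft accepts
    -p (listening port) and -w (number of worker processes).
    """
    return ["ppserver", "-p", str(port), "-w", str(workers)]


def tunnel_command(remote_host, local_port, remote_port):
    """Command to run LOCALLY to forward local_port to the remote ppserver.

    Plain OpenSSH: -N runs no remote command, -L sets up a local port forward.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, remote_host]


if __name__ == "__main__":
    # Hypothetical host and ports, for illustration only.
    print(shlex.join(ppserver_command(5653, 4)))
    print(shlex.join(tunnel_command("user@remote.example.com", 5654, 5653)))
```

With such a tunnel running, the pool would be pointed at the local end of the forward (here `localhost:5654`) rather than at the remote host name directly.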

Having said that, it's pretty difficult to get exactly 4 remote workers and 4 local workers, as the ParallelPool is dynamically load balanced. Thus, if you have "quick" tasks to run, the vast majority, if not all, of the tasks will run locally, because spinning up the connection, shipping the tasks, and retrieving the results takes more time than just running the jobs locally. You can force tasks to run remotely by zeroing out (or seriously limiting) the ncpus locally available to the pool, but how many jobs run where will still depend on automated load balancing, which weighs the number of locally available tasks and some measure of the time an individual job takes to complete against the time it takes to connect and run the jobs remotely.
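A minimal sketch of the "zero out the local ncpus" idea, assuming the ppservers are already running and reachable (for example through SSH tunnels on local ports). It uses `ParallelPool` from `pathos.pools` rather than the `ProcessPool` in the question, since `ParallelPool` is the pool the load-balancing description refers to. `tunnel_servers`, `make_remote_only_pool`, and `run_demo` are hypothetical helper names, and the ports are placeholders.

```python
def tunnel_servers(local_ports):
    """Build the 'host:port' strings for servers reached via local tunnels."""
    return tuple(f"localhost:{port}" for port in local_ports)


def make_remote_only_pool(servers):
    """Create a ParallelPool that offers no local worker processes.

    pathos is imported here so the helper above works without it installed.
    """
    from pathos.pools import ParallelPool

    pool = ParallelPool()
    pool.ncpus = 0          # no local workers, so tasks must go to the servers
    pool.servers = servers  # e.g. ('localhost:5653', 'localhost:5654')
    return pool


def square(x):
    """Stand-in for the real task function."""
    return x * x


def run_demo():
    """Map a task over the servers; call only once the ppservers are up."""
    # Hypothetical ports: one ppserver (or tunnel) per port.
    pool = make_remote_only_pool(tunnel_servers([5653, 5654]))
    return pool.map(square, range(8))
```

How the work is divided between the listed servers is still decided by the pool's load balancing, so this limits where tasks can run rather than guaranteeing an even split.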

Mike McKerns
  • Thinking more about specifically getting 4 nodes on each... I haven't tried it, but maybe you could set up a remote `ppserver` and a local `ppserver`, and set a ssh-tunnel connection to them both... and set your pool to use zero "local" `ncpus`. Hmm. – Mike McKerns Aug 27 '18 at 16:24
  • Unfortunately, I haven't found any tutorial for how to use pathos. Documentation is also incomplete. I kindly ask Mike (the author of pathos) to write a tutorial about it. – Mehdi Apr 28 '20 at 07:59
  • @Mehdi: noted. `pathos` fundamentally provides a uniform interface for `multiprocess` and `ppft`. So, if you look at `multiprocessing` docs or Parallel Python docs, you'll get 90% or more of what you'd need for `pathos`. – Mike McKerns Apr 28 '20 at 13:30
  • I need it for distributed computing. I tried to set up a ssh tunnel (to a windows client), but pathos cannot connect. However, if it can, I don't know how to set up pathos server – Mehdi Apr 28 '20 at 14:24
  • @Mehdi: Ah... the ssh tunnel, yes, that's central to `pathos` itself. I'd recommend looking at the docs in https://github.com/uqfoundation/pathos/blob/master/scripts/pathos_connect, which is used to make the tunnel connection, potentially to a pathos server. Or you can establish a tunnel a bit more manually, as in: https://github.com/uqfoundation/pathos/blob/master/examples/simple_tunnel.py (and also look at `secure_*py` for more `ssh` stuff). There's also an example of how to use `pathos_connect` in https://github.com/uqfoundation/pathos/blob/master/examples/sum_primesX.py. – Mike McKerns Apr 28 '20 at 15:23
  • Been a while since I've connected to a windows server -- the last time I built a full windows setup, I believe, is here: https://github.com/mmckerns/tuthpc. I agree, a full tutorial would be nice. It's been on my todo list for a long time, unfortunately. – Mike McKerns Apr 28 '20 at 15:26
  • @Mark, finally I found out that I should run ppserver (https://softwarerecs.stackexchange.com/questions/38802/best-parallel-python-for-a-cluster); but there are two problems: 1) the ppserver is not in pathos library. 2) The ppserver available in pp library is old and its version does not match with pathos. System says PP version mismatch (local: pp-1.6.6.1, remote: pp-1.6.4.4) so I receive this error on server: RuntimeError: Socket connection is broken. There should be a ppserver alternative in pathos; but I don't know it. Please advise. – Mehdi Apr 28 '20 at 23:14
  • @Mehdi: (0) I'll assume you are addressing me, not Mark, (1) when you install `pathos`, it should install `ppserver` as a command line script, (2) `pathos` uses `ppft`, as I mentioned above, *not* `pp` -- so don't install `pp`; its broken install procedure can make it conflict with the newer `ppft`. If you have `setuptools`, then everything you need will be installed for you, including the `ppserver` script. If you note the accepted answer to the OP's question, it states that you need to set up a `ppserver` and details what steps to take. – Mike McKerns Apr 29 '20 at 12:41
  • Thanks for the response. My bad for calling you Mark! Oh... finally I found it. It's in the Scripts folder; I was looking for it in the Lib\site-packages\pp folder. Because I installed it in a conda env, python couldn't pick up ppserver, and that was why I couldn't run it. However, I managed to run the old ppserver by changing its version to 1.6.6.1 !! :) Thanks Mike. – Mehdi May 02 '20 at 16:38
  • @Mehdi: No worries. Apologies about the lack of a comprehensive tutorial. – Mike McKerns May 03 '20 at 12:03
  • I will make one soon and share it here ;) – Mehdi May 04 '20 at 09:59
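
The version mismatch and the hard-to-find `ppserver` script discussed in the comments above can be checked for on each machine before debugging the connection itself. A small sketch using only the standard library; it assumes the distributions are named `ppft` and `pp` and that the script is installed as `ppserver` on the PATH of the active environment. `installed_version` and `check_pp_environment` are hypothetical helpers.

```python
import shutil
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pp_environment():
    """Collect the facts that matter for matching local and remote ppservers."""
    return {
        "ppft": installed_version("ppft"),     # the package pathos uses
        "pp": installed_version("pp"),         # legacy package, ideally None
        "ppserver": shutil.which("ppserver"),  # None if the script is not on PATH
    }


if __name__ == "__main__":
    for name, value in check_pp_environment().items():
        print(f"{name}: {value}")
```

Running this on both the local machine and the server and comparing the output should show whether the two sides report different versions, or whether `ppserver` resolves to a script outside the active environment.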