
I'm trying to set up a remote cluster on a group of servers I own using the ipyparallel library. I assumed that if I shared $IPYTHONDIR between all the ipcontrollers, ipengines and the notebook, everything would just connect and work, but that is not the case with my current setup.

What I'm trying to accomplish is to have an ipcontroller and its ipengines sitting on my cluster, waiting for a Jupyter notebook to connect to the controller and use its cluster computing resources.

Currently I cannot get my notebook to connect to my controller even though all ports are open, the servers are directly accessible, and the IPYTHONDIR is shared.

When I open my notebook and go to the Clusters tab, I see my parallel profile, but it is shown as not started, which is odd because the ipcontroller and ipengines are already running and waiting for a connection from the notebook.

This boils down to:

  • Is it possible to run a notebook on a different server than the ipcontroller?
  • If so, why can't I get the notebook to connect to the cluster? When I click Start on the profile, it simply creates a local cluster instead.

Thanks!

Jeff

1 Answer


Yes, this is possible if the notebook kernel is running on the same server as the ipcontroller. The notebook itself can be displayed in any browser. I use that functionality regularly.

The way I have done it is to have an IPython profile available on the server. In my case it's a Windows server and the profiles are set up under c:\users\<user>\.ipython\. The profile folder is called profile_my32bitcluster, and when I create the client I specify the profile to use:

import pandas as pd
from ipyparallel import Client

# Connect using the connection files stored in profile_my32bitcluster
rc = Client(profile='my32bitcluster')
dview = rc[:]  # DirectView over all available engines

# Test it by pushing a DataFrame out to the engines, modifying it there,
# and returning the modified DataFrames...
df = pd.DataFrame(data={'x': [1, 2, 3, 4, 5], 'y': [1, 4, 9, 16, 25]})

dview.push({'df': df})

def myfunc(x):
    # Runs on an engine; df is the copy pushed into that engine's namespace
    global df
    df['z'] = df['x'] * x
    return df

results = dview.map_sync(myfunc, [2, 3, 4])
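If the client still cannot connect from another machine, one thing worth checking is what address the controller advertised in the profile's connection file. The following is only a rough sketch using the standard library. It assumes the file (typically ipcontroller-client.json in the profile's security folder) is JSON with `interface` and `registration` keys, which may differ between ipyparallel versions. The helper name `describe_connection` is made up for illustration.

```python
import json
from pathlib import Path


def describe_connection(path):
    """Summarise a controller connection file (hypothetical helper).

    Assumes the JSON carries 'interface' (e.g. "tcp://127.0.0.1") and
    'registration' (a port number). Missing keys are reported as None
    rather than guessed at.
    """
    info = json.loads(Path(path).read_text())
    interface = info.get("interface")
    # Strip the transport prefix ("tcp://") to get the bare host, if present
    host = interface.split("://", 1)[-1] if interface else None
    return {
        "interface": interface,
        "registration_port": info.get("registration"),
        # True when the advertised address is only reachable from the same machine
        "loopback_only": host in ("127.0.0.1", "localhost", "::1"),
    }
```

If `loopback_only` comes back True, the controller was probably started listening only on the local interface, and restarting it with an explicit address (for example `ipcontroller --ip=*`) may be what is needed before clients on other servers can reach it.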

I hope that helps.

Adrian Mc