Questions tagged [ipython-parallel]

Use this tag for questions related to IPython's architecture for parallel and distributed computing.

Quoting the IPython parallel overview:

IPython has a sophisticated and powerful architecture for parallel and distributed computing. This architecture abstracts out parallelism in a very general way, which enables IPython to support many different styles of parallelism. [...] Most importantly, IPython enables all types of parallel applications to be developed, executed, debugged and monitored interactively. Hence, the I in IPython.

This tag is used for all sorts of questions about using IPython's parallel capabilities.

191 questions
2
votes
0 answers

ipcluster command not creating full set of engines

Using Ubuntu 12.04 I am trying to set up a LAN cluster. The details: Controller Config # Configuration file for ipcontroller. c = get_config() c.IPControllerApp.reuse_files = True c.IPControllerApp.engine_ssh_server = u'bar@bar1' c.HubFactory.ip =…
phil0stine
  • 303
  • 1
  • 13
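A frequent cause of engines failing to register over a LAN is the controller listening only on localhost. A minimal sketch of the relevant `ipcontroller_config.py` options, based on the excerpt above (the SSH server `bar@bar1` is the question's own placeholder; the wildcard IP is an assumption for illustration):

```python
# ipcontroller_config.py -- minimal LAN sketch, not a complete config
c = get_config()

c.IPControllerApp.reuse_files = True                # reuse connection files across restarts
c.IPControllerApp.engine_ssh_server = u'bar@bar1'   # SSH tunnel for remote engines
c.HubFactory.ip = '*'                               # listen on all interfaces, not just localhost
```

With `reuse_files = True`, stale connection files from an earlier run can also prevent engines from registering, so deleting the security files is worth trying before deeper debugging.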
2
votes
1 answer

Execution Time Difference based on sum() on IPython

I'm doing a simple Monte Carlo simulation exercise, using ipcluster engines of IPython. I've noticed a huge difference in execution time based on how I define my function, and I'm asking the reason for this. Here are the details: When I define the…
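One common source of such timing differences is materialising an intermediate list versus feeding `sum()` a generator expression. A pure-Python sketch (no engines involved; the pi-estimation function is a hypothetical stand-in for the question's simulation):

```python
import random

def estimate_pi(n, seed=42):
    """Estimate pi by sampling n points in the unit square.

    sum() consumes the generator lazily, so no intermediate
    list of n values is ever built in memory.
    """
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / n

print(estimate_pi(100_000))  # roughly 3.14
```

The same function wrapped around `sum([...])` with square brackets allocates the whole list first, which can dominate the runtime for large `n`.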
2
votes
2 answers

How to use IPython.parallel map() with generators as input to function

I am trying to use IPython.parallel map. The inputs to the function I wish to parallelize are generators. Because of size/memory it is not possible for me to convert the generators to lists. See code below: from itertools import product from…
Vincent
  • 1,579
  • 4
  • 23
  • 38
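Since a parallel `map()` needs inputs it can partition and serialize, a common workaround is to chunk the generator lazily and submit one chunk at a time. A sketch of that pattern, using the built-in `map` as a local stand-in for the view's map call (`work` and the chunk size are hypothetical):

```python
from itertools import islice, product

def chunks(iterable, size):
    """Yield lists of up to `size` items from any iterator,
    without ever materialising the whole thing."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

def work(pair):              # hypothetical per-item function
    a, b = pair
    return a * b

gen = product(range(300), range(300))   # 90,000 pairs, never held in memory at once

results = []
for block in chunks(gen, 10_000):
    # on a real cluster this line would be:
    #   results.extend(view.map_sync(work, block))
    results.extend(map(work, block))

print(len(results))  # 90000
```

Only one chunk of the generator is ever in memory at a time, so this scales to inputs that cannot be converted to a list.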
2
votes
2 answers

Real-time output from engines in IPython parallel?

I am running a bunch of long-running tasks with IPython's great parallelization functionality. How can I get real-time output from the ipengines' stdout in my IPython client? E.g., I'm running dview.map_async(fun, lots_of_args) and fun prints to…
rodion
  • 6,087
  • 4
  • 24
  • 29
2
votes
1 answer

Celery vs IPython parallel

I have looked at the documentation on both, but am not sure what's the best choice for a given application. I have looked closer at celery, so the example will be given in those terms. My use case is similar to this question, with each worker…
phil0stine
  • 303
  • 1
  • 13
2
votes
1 answer

Parallelize function on dictionary in IPython

Up until now, I have parallelized functions by mapping them onto lists that are distributed out to the various engines using map_sync(function, list). Now, I need to run a function on each entry of a dictionary. map_sync does not…
Kao
  • 185
  • 1
  • 1
  • 8
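Because `map_sync`, like the built-in `map`, accepts multiple positional sequences, the usual trick is to unzip the dictionary into parallel key and value sequences. A sketch with the built-in `map` standing in for the view call (`process` and the dict contents are hypothetical):

```python
def process(key, value):     # hypothetical per-entry function
    return key, value ** 2

d = {"a": 2, "b": 3, "c": 4}

# unzip the dict into aligned key and value tuples
keys, values = zip(*d.items())

# on a cluster this would be: view.map_sync(process, keys, values)
results = dict(map(process, keys, values))

print(results)  # {'a': 4, 'b': 9, 'c': 16}
```

Mapping over `d.items()` with a function taking a single `(key, value)` tuple works just as well; the two-sequence form merely avoids unpacking inside the worker.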
2
votes
2 answers

Handle exceptions whilst waiting for next ipython parallel map result

I want to iterate over some asynchronous results from an ipython parallel map as they arrive. The only way I can find to do this is to iterate over the results object. However if one of the tasks raises an exception the iteration terminates. Is…
Epimetheus
  • 1,119
  • 1
  • 10
  • 19
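A common workaround is to wrap the task so exceptions are returned as values rather than raised, which keeps the result iteration alive past a failed task. A local sketch of the pattern (`task` is hypothetical; note that for real engines the wrapped function must be importable on the workers, since closures may not serialize):

```python
def safe(fn):
    """Wrap fn so it returns ('ok', result) or ('err', repr_of_exception)
    instead of raising -- iterating over results then never aborts."""
    def wrapper(x):
        try:
            return ("ok", fn(x))
        except Exception as exc:
            return ("err", repr(exc))
    return wrapper

def task(x):                  # hypothetical task that fails on one input
    if x == 2:
        raise ValueError("bad input")
    return x * 10

# on a cluster: for status, value in view.map(safe_task, range(5)): ...
outcomes = list(map(safe(task), range(5)))
print(outcomes)
```

Each result is then inspected with a simple `if status == "err"` branch instead of a try/except around the iteration itself.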
1
vote
0 answers

IPyparallel: cannot remove cluster

I'm trying to use the function remove_cluster of ipyparallel.ClusterManager() import ipyparallel as ipp cluster = ipp.Cluster(n=2) # start cluster synchronously cluster.start_cluster_sync() #
user2314737
  • 27,088
  • 20
  • 102
  • 114
1
vote
2 answers

How do I use multiprocessing on Python to speed up a for loop?

I have this code which I would like to use multi-processing to speed up: matrix=[] for i in range(len(datasplit)): matrix.append(np.array(np.asarray(datasplit[i].split()),dtype=float)) The variable "datasplit" is a comma-separated list of…
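A sketch of the map pattern for this loop using `concurrent.futures`. A thread pool is used here so the snippet runs anywhere; for CPU-bound parsing a `ProcessPoolExecutor` (with the worker defined at module level, under an `if __name__ == "__main__"` guard) is the usual choice. The name `datasplit` comes from the excerpt; the toy data is an assumption:

```python
from concurrent.futures import ThreadPoolExecutor

def parse_row(line):
    """Parse one whitespace-separated line into a list of floats
    (a stand-in for np.asarray(line.split(), dtype=float))."""
    return [float(tok) for tok in line.split()]

datasplit = ["1 2 3", "4.5 5.5 6.5", "7 8 9"]   # toy data

with ThreadPoolExecutor(max_workers=4) as pool:
    matrix = list(pool.map(parse_row, datasplit))

print(matrix)  # [[1.0, 2.0, 3.0], [4.5, 5.5, 6.5], [7.0, 8.0, 9.0]]
```

The key change from the original loop is that the per-row work lives in a named function, which any of the pool APIs (multiprocessing, concurrent.futures, ipyparallel) can then map over the rows.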
1
vote
0 answers

Parallelizing for loops in python

I know similar questions on this topic have been asked before, but I'm still struggling to make any headway with my problem. Basically, I have three dataframes (of sizes 402 x 402, 402 x 3142, and 1 x 402) and I'm combining elements from them into…
mikdale
  • 11
  • 1
1
vote
1 answer

joblib Parallel running out of memory

I have something like this outputs = Parallel(n_jobs=12, verbose=10)(delayed(_process_article)(article, config) for article in data) Case 1: Run on ubuntu with 80 cores: CPU(s): 80 Thread(s) per core: 2 Core(s) per socket: …
suprita shankar
  • 1,554
  • 2
  • 16
  • 47
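One way to bound memory with a pattern like this is to submit the work in batches instead of building one giant `delayed()` list, so only one batch of inputs and outputs is alive at a time. A pure-Python sketch of the batching (the worker and batch size are hypothetical; the commented line shows where the joblib call from the excerpt would go):

```python
from itertools import islice

def batched(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

def _process_article(article):    # hypothetical stand-in for the real worker
    return len(article)

data = (f"article-{i}" for i in range(10))   # lazily produced inputs

outputs = []
for batch in batched(data, 4):
    # with joblib this inner call would be:
    #   outputs.extend(Parallel(n_jobs=12)(delayed(_process_article)(a) for a in batch))
    outputs.extend(_process_article(a) for a in batch)

print(len(outputs))  # 10
```

Peak memory is then proportional to the batch size rather than to `len(data)`, at the cost of a small scheduling overhead per batch.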
1
vote
1 answer

Nested parallelism with scikit learn models

I want to do nested parallelism with scikit-learn LogisticRegressionCV inside a for loop: for i in range(0,10): LogisticRegressionCV(n_jobs=-1) I want to parallelize the for loop as well. I read a lot of posts but I couldn't understand…
1
vote
0 answers

Parallel file reading in python

I have been trying to read a large file and write to another file at the same time, after processing the data from the input file. The file is pretty huge, around 4-8 GB. Is there a way to parallelise the process to save time? The original…
Harsh Sharma
  • 183
  • 1
  • 10
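For a single multi-GB file, the usual structure is to stream fixed-size chunks so memory stays flat regardless of file size; a thread pool can then overlap processing of one chunk with reading the next. A minimal sequential sketch of the chunked read-process-write loop (the transformation and chunk size are hypothetical; a temp file stands in for the real input):

```python
import os
import tempfile

CHUNK = 64 * 1024   # 64 KiB per read keeps memory flat regardless of file size

def process(chunk: bytes) -> bytes:
    """Hypothetical transformation -- here just uppercasing."""
    return chunk.upper()

# build a toy input file standing in for the 4-8 GB one
src = tempfile.NamedTemporaryFile(delete=False, suffix=".txt")
src.write(b"hello world\n" * 10_000)
src.close()

dst_path = src.name + ".out"
with open(src.name, "rb") as fin, open(dst_path, "wb") as fout:
    while chunk := fin.read(CHUNK):
        # a thread pool could process chunk i while chunk i+1 is read,
        # but the writes must still be issued in order
        fout.write(process(chunk))

print(os.path.getsize(dst_path))  # 120000
```

Whether parallelism helps at all depends on where the time goes: if the job is disk-bound, chunked sequential I/O like this is often already close to optimal.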
1
vote
1 answer

Specify a number of ipengine instances to be launched within ipyparallel cluster

Speaking of ipyparallel, is it possible to specify a number of ipengines to simultaneously launch on a slave machine, and if so - how do I do it? For example, one can specify a number of engines to launch on localhost with ipcluster start -n…
Vasily
  • 2,192
  • 4
  • 22
  • 33
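With the SSH launcher, ipyparallel lets the engine count be set per host via a dictionary in `ipcluster_config.py`. A minimal sketch (the hostnames and counts are hypothetical):

```python
# ipcluster_config.py -- sketch, not a complete config
c = get_config()

c.IPClusterEngines.engine_launcher_class = 'SSH'

# map each host to the number of engines to launch on it
c.SSHEngineSetLauncher.engines = {
    'host1': 4,
    'host2': 2,
}
```

`ipcluster start` then launches four engines on `host1` and two on `host2`, analogous to `ipcluster start -n` for localhost.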
1
vote
0 answers

Optimization: alternatives to passing large array to map in ipyparallel?

I originally wrote a nested for loop over a test 3D array in python. As I wanted to apply it to larger array which would take a lot more time, I decided to parallelise using ipyparallel by writing it as a function and using bview.map. This way I…