Questions tagged [pathos]

'pathos' provides a fork of python's 'multiprocessing', where 'pathos.multiprocessing' can send a much broader range of the built-in python types across a parallel 'map' and 'pipe' (similar to python's 'map' and 'apply'). 'pathos' also provides a unified interface for parallelism across processors, threads, sockets (using a fork of 'parallelpython'), and across 'ssh'.
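
As a rough illustration of why a broader serializer matters for a parallel map, the sketch below uses only the standard library (no pathos calls) to show which callables the stock `pickle` module accepts. The helper name `can_pickle` and the variable `square` are hypothetical, introduced only for this example.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The standard pickler rejects objects it cannot look up by name.
        return False


# A lambda has no importable name, so pickle cannot store it by reference.
square = lambda x: x * x

print(can_pickle(abs))     # built-in function, pickled by reference
print(can_pickle(square))  # lambda, rejected by the standard pickler
```

The standard `multiprocessing.Pool.map` relies on this same pickling step to ship the function to worker processes, which is the limitation the tag description says `pathos.multiprocessing` relaxes.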

190 questions
3
votes
1 answer

python multiprocessing.Pool Too many files open logging files

This is in regard to "Too many files open with multiprocessing.Pool". I am having a similar problem. My setup is Ubuntu on a quad-core machine running this simple script (Python 2.7 with pathos==0.2a1.dev; pathos is only being used to allow mp map to work…
amulllb
  • 3,036
  • 7
  • 50
  • 87
2
votes
2 answers

How to avoid excessive ram consumption using pathos

This is a rough example of how I leverage multiprocessing with pathos: from pathos.multiprocessing import ProcessingPool pool = ProcessingPool(10) results = pool.map(func, args) Each func's run can take a while. Let's say it's 5 minutes, and…
2
votes
0 answers

Parallelisation on databricks with Pathos or Ray

I am forecasting values for several thousand independent objects. The scripts are executed on Databricks. Every forecast takes several seconds. Therefore, I would like to try parallelisation to hopefully see a speedup. Toy task Suppose I have the…
Simon B
  • 199
  • 1
  • 9
2
votes
0 answers

Pathos p_tqdm multiprocess error with dill

I'm trying to run some code through the p_tqdm library with p_map() to parallelize some code. I run into this dill-related error that I can't figure out. Traceback (most recent call last): File…
Antoine Zambelli
  • 724
  • 7
  • 19
2
votes
1 answer

What is the canonical way to use locking with `pathos.pools.ProcessPool`?

Let's consider the following example: from pathos.pools import ProcessPool class A: def run(self, arg: int): shared_variable = 100 def __run_parallel(arg: int): local_variable = 0 # ... …
p4dn24x
  • 445
  • 4
  • 14
2
votes
1 answer

Is there a `pathos`-specific way to determine the number of CPU cores?

I want to determine the number of physical CPU cores to size the number of nodes for a ProcessPool accordingly. from pathos.pools import ProcessPool process_pool = ProcessPool(nodes=?) I know that psutil provides this number as described…
p4dn24x
  • 445
  • 4
  • 14
2
votes
1 answer

How to shutdown Python Process Pools properly?

I have a use case where I have to process some documents, which takes some time. So I tried batching the documents and multiprocessing them; it worked well and completed in less time, as expected. Also, there are multiple stages of processing docs,…
2
votes
1 answer

How to execute python subTests in parallel?

Consider the following unittest.TestCase, which implements two versions of the same test, the only difference being that one executes the subTests in parallel using multiprocessing. import multiprocessing as mp from unittest import TestCase class…
dspencer
  • 4,297
  • 4
  • 22
  • 43
2
votes
2 answers

Multiprocessing using Pool class in Python giving Pickling error

I am trying to run a simple multiprocessing example in Python 3.6 in a Zeppelin notebook (on Windows), but I am not able to execute it. Below is the code that I used: def sqrt(x): return x**0.5 numbers = [i for i in range(1000000)] with Pool() as…
Shubham Kedia
  • 77
  • 1
  • 5
2
votes
0 answers

Pathos.multiprocessing + memory leakage + amap

I'm building a web scraper and I have a method I am running concurrently as follows: def parallel_scrape(): p = Pool() results = p.amap(self.fetch, domain_list) while not results.ready(): time.sleep(5) It works as expected…
Y. Leonce Eyog
  • 883
  • 2
  • 13
  • 29
2
votes
0 answers

Pathos multiprocessing pool hangs

I'm trying to use multiprocessing inside a Docker container. However, I'm facing two issues. (I'm using Python 2.7.) Creating ProcessingPool()/Pool() (I tried both) takes an abnormally long time, maybe over a minute or two. After it processes…
Lowan
  • 277
  • 3
  • 9
2
votes
0 answers

Read a lot of data using pool.map ('error("'i' format requires -2147483648 <= number <= 2147483647")')

I'm reading data from databases. I need to read from several servers (nodes) simultaneously, so I want to use pool.map. I'm trying to do it this way: import pathos.pools as pp import pandas as pd import urllib class DataProvider(): def…
Mikhail_Sam
  • 10,602
  • 11
  • 66
  • 102
2
votes
1 answer

Pipes are getting stuck--no other solution on stack overflow working

(UPDATED) I am building a module to distribute agent-based models. The idea is to split the model over multiple processes; then, when the agents reach a boundary, they are passed to the processor handling that region. I can get the processes set up…
TPike
  • 286
  • 1
  • 3
  • 12
2
votes
1 answer

How to set chunk size when using pathos ProcessingPool's map?

I'm running into inefficient parallelisation with Pathos' ProcessingPool.map() function: Towards the end of the processing, a single slow running worker processes the last tasks in the list sequentially while other workers are idle. I think this is…
DCS
  • 3,354
  • 1
  • 24
  • 40
2
votes
1 answer

Instance attributes do not persist using multiprocessing

I'm having an issue with instances not retaining changes to attributes, or even keeping new attributes that are created. I think I've narrowed it down to the fact that my script takes advantage of multiprocessing, and I'm thinking that changes…