I am using multiprocessing in Python to parallelize some compute-heavy functions, but I found that process start-up is delayed when I pass a fat argument (e.g., a 1000-node networkx graph or a 1,000,000-item list). I experimented with two multiprocessing modules, "multiprocessing" and "pathos", and got similar results. My question is how to avoid this delay, because it ruins the benefit of parallel computing.
In my sample code, I just pass a fat argument to the function run by the pool; the function body does not touch the argument at all.
- The sample code using "multiprocessing":
import multiprocessing
import time


def f(args):
    # graph is the fat argument; it is passed in but never used.
    (x, conn, t0, graph) = args
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' % (x0, t, ans))
    return ans


def main():
    var = (4, 8, 12, 20, 16)
    p = multiprocessing.Pool(processes=4)
    p_conn, c_conn = multiprocessing.Pipe()
    params = []
    t0 = time.time()
    N = 1000
    import networkx as nx
    # Complete directed graph on N nodes with a random weight on every edge.
    G = nx.complete_graph(N, nx.DiGraph())
    import random
    for (start, end) in G.edges:
        G.edges[start, end]['weight'] = random.random()
    # Every task carries the same graph G as part of its argument tuple.
    for i in var:
        params.append((i, c_conn, t0, G))
    res = list(p.imap(f, params))
    p.close()
    p.join()
    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))


if __name__ == '__main__':
    main()
The output of the above sample code:
output:
factorial of 4: start@29.78s
factorial of 4: finish@31.29s, res = 24
factorial of 8: start@53.56s
factorial of 8: finish@57.07s, res = 40320
factorial of 12: start@77.25s
factorial of 12: finish@82.75s, res = 479001600
factorial of 20: start@100.39s
factorial of 20: finish@109.91s, res = 2432902008176640000
factorial of 16: start@123.55s
factorial of 16: finish@131.05s, res = 20922789888000
factorial of (4, 8, 12, 20, 16)@131.06s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]
Process finished with exit code 0
According to the above output, each task starts roughly 23 to 24 seconds after the previous one, even though the pool has 4 workers.
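The delay can be read off the output by subtracting consecutive start times. A minimal sketch of that arithmetic, with the start times copied from the output above (the helper name `start_gaps` is only for illustration):

```python
# Start times in seconds, copied from the "multiprocessing" output above.
starts = [29.78, 53.56, 77.25, 100.39, 123.55]


def start_gaps(times):
    """Return the differences between consecutive start times."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


gaps = start_gaps(starts)
print('gaps between starts:', gaps)
print('mean gap: %.2fs' % (sum(gaps) / len(gaps)))
```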
- The sample code using "pathos":
import pathos
import multiprocess
import time


def f(x, conn, t0, graph):
    # graph is the fat argument; it is passed in but never used.
    ans = 1
    x0 = x
    t = time.time() - t0
    conn.send('factorial of %d: start@%.2fs' % (x0, t))
    while x > 1:
        ans *= x
        time.sleep(0.5)
        x -= 1
    t = time.time() - t0
    conn.send('factorial of %d: finish@%.2fs, res = %d' % (x0, t, ans))
    return ans


def main():
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool(nodes=4)
    p_conn, c_conn = multiprocess.Pipe()
    t0 = time.time()
    # pathos imap takes one iterable per positional argument of f.
    conn_s = [c_conn] * len(var)
    t0_s = [t0] * len(var)
    N = 1000
    import networkx as nx
    # Complete directed graph on N nodes with a random weight on every edge.
    G = nx.complete_graph(N, nx.DiGraph())
    import random
    for (start, end) in G.edges:
        G.edges[start, end]['weight'] = random.random()
    res = list(p.imap(f, var, conn_s, t0_s, [G] * len(var)))
    print('output:')
    while p_conn.poll():
        print(p_conn.recv())
    t = time.time() - t0
    print('factorial of %s@%.2fs: %s' % (var, t, res))


if __name__ == '__main__':
    main()
The output of the above sample code:
output:
factorial of 4: start@29.63s
factorial of 4: finish@31.13s, res = 24
factorial of 8: start@53.50s
factorial of 8: finish@57.00s, res = 40320
factorial of 12: start@76.94s
factorial of 12: finish@82.44s, res = 479001600
factorial of 20: start@100.72s
factorial of 20: finish@110.23s, res = 2432902008176640000
factorial of 16: start@123.69s
factorial of 16: finish@131.20s, res = 20922789888000
factorial of (4, 8, 12, 20, 16)@131.20s: [24, 40320, 479001600, 2432902008176640000, 20922789888000]
Process finished with exit code 0
Similarly, according to the above output, each task starts roughly 23 to 24 seconds after the previous one.
If I reduce the graph size (fewer nodes), the delay decreases accordingly. My guess is that the extra time is spent pickling/dilling the networkx graph when it is passed as an argument. Ideally, the first 4 tasks should start at the same time. How can I avoid this cost? Thank you!
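To check the pickling guess in isolation, a minimal sketch like the one below could time a single serialization of the argument outside of any pool. The helper name `time_pickle` and the list payload are stand-ins for illustration only; substituting the real graph `G` would give the relevant number. Since "pathos" serializes with dill rather than pickle, `dill.dumps` would be the analogous call to time there.

```python
import pickle
import time


def time_pickle(obj):
    """Return (seconds, size_in_bytes) for one pickle.dumps call on obj."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    elapsed = time.perf_counter() - start
    return elapsed, len(payload)


if __name__ == '__main__':
    # Stand-in for the fat argument; replace with the real graph G.
    fat = list(range(1000000))
    seconds, size = time_pickle(fat)
    print('pickled %d bytes in %.3fs' % (size, seconds))
```

If the time measured for one dump of `G` is close to the gap between task starts, that would support the guess that serializing the argument once per task is what holds the workers back.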
UPDATE
Thanks to Alexander's kind answer, I removed the pipe from both the "multiprocessing" and "pathos" code. The "multiprocessing" code now performs like Alexander's, with the delay reduced to 1 second, but the "pathos" code still has a delay of more than 20 seconds. The revised "pathos" code is posted below:
import pathos
import multiprocess
import time
from pympler import asizeof
import sys


def f(args):
    # graph is the fat argument; it is passed in but never used.
    (x, graph) = args
    t = time.ctime()
    print('factorial of %d: start@%s' % (x, t))
    time.sleep(4)
    return x


def main():
    t0 = time.time()
    params = []
    var = (4, 8, 12, 20, 16)
    p = pathos.multiprocessing.ProcessPool(nodes=4)
    N = 1000
    import networkx as nx
    # Complete directed graph on N nodes with a random weight on every edge.
    G = nx.complete_graph(N, nx.DiGraph())
    import random
    for (start, end) in G.edges:
        G.edges[start, end]['weight'] = random.random()
    print('Size of G by sys', sys.getsizeof(G), 'asizeof', asizeof.asizeof(G))
    print('G created in %.2f' % (time.time() - t0))
    for i in var:
        params.append((i, G))
    res = list(p.imap(f, params))
    p.close()
    p.join()


if __name__ == '__main__':
    main()
The output is:
Size of G by sys 56 asizeof 338079824
G created in 17.36
factorial of 4: start@Fri May 31 11:39:26 2019
factorial of 8: start@Fri May 31 11:39:53 2019
factorial of 12: start@Fri May 31 11:40:19 2019
factorial of 20: start@Fri May 31 11:40:44 2019
factorial of 16: start@Fri May 31 11:41:10 2019
Process finished with exit code 0