I want to parallelise the operation of a function on each element of a list using Ray. A simplified snippet is below:
import numpy as np
import time
import ray
import psutil

num_cpus = psutil.cpu_count(logical=False)
ray.init(num_cpus=num_cpus)

@ray.remote
def f(a, b, c):
    return a * b - c

def g(a, b, c):
    return a * b - c

def my_func_par(large_list):
    # arguments a and b are constant, just to illustrate
    # argument c is each element of the list large_list
    [f.remote(1.5, 2, i) for i in large_list]

def my_func_seq(large_list):
    # arguments a and b are constant, just to illustrate
    # argument c is each element of the list large_list
    [g(1.5, 2, i) for i in large_list]

my_list = np.arange(1, 10000)

s = time.time()
my_func_par(my_list)
print(time.time() - s)
>>> 2.007

s = time.time()
my_func_seq(my_list)
print(time.time() - s)
>>> 0.0372
The problem is, when I time my_func_par, it is much slower (~54x, as can be seen above) than my_func_seq. One of the authors of Ray answers a comment on this blog that seems to explain that what I am doing here is setting up len(large_list) different tasks, which is incorrect.
How do I use Ray and modify the code above to run it in parallel? (Maybe by splitting large_list into chunks, with the number of chunks equal to the number of CPUs; a rough sketch of what I mean is below.)
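This is roughly what I have in mind for the chunked version. It is only a minimal sketch: the helper f_chunk and the use of np.array_split are my own guesses at an approach, not code I know to be the right answer, and it assumes ray.init has already been called as above.

import numpy as np
import ray

@ray.remote
def f_chunk(a, b, chunk):
    # hypothetical helper: apply the element-wise operation
    # to a whole chunk in a single task
    return [a * b - c for c in chunk]

def my_func_par_chunked(large_list, num_cpus):
    # one task per CPU instead of one task per element
    chunks = np.array_split(large_list, num_cpus)
    futures = [f_chunk.remote(1.5, 2, chunk) for chunk in chunks]
    # block on the futures and flatten the per-chunk results
    return [x for chunk in ray.get(futures) for x in chunk]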
EDIT: There are two important criteria in this question:
- The function f needs to accept multiple arguments.
- It may be necessary to use ray.put(large_list) so that the large_list variable can be stored in shared memory rather than copied to each processor.
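For the second point, this is the kind of usage I mean. Again only a minimal sketch: f_on_list is a hypothetical remote function of my own, and I am assuming large_list is defined as above; as far as I understand, Ray resolves the ObjectRef argument to the stored list before the task runs, so the list is not copied for each task.

import ray

@ray.remote
def f_on_list(a, b, the_list):
    # the_list arrives as the actual list; Ray dereferences the ObjectRef
    return [a * b - c for c in the_list]

# store the list once in Ray's shared-memory object store
list_ref = ray.put(large_list)
# pass the reference instead of the list itself, avoiding a per-task copy
result = ray.get(f_on_list.remote(1.5, 2, list_ref))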