
To clarify first of all, I'm NOT asking why map in multiprocessing is slow. I had code working just fine using pool.map(). But, in developing it (and to make it more generic), I needed to use pool.starmap() to pass 2 arguments instead of one. I'm still fairly new to Python and multiprocessing, so I'm not sure if I'm doing something obviously wrong here. I also couldn't find anything on this that's been asked previously, so apologies if this has already been answered. I'm using Python 3.10, by the way.

I'm processing just under 5 million items in a list, and managed to get a result in just over 12 minutes (instead of a predicted 6½ days if it was run iteratively!) when using pool.map(). I'm essentially obtaining the intersection of List_A and List_B, but I need to preserve the frequencies of each item, and so have to do it in O(n·m).
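(As an aside, and as a commenter points out below, a duplicate-preserving intersection can also be computed in roughly O(n + m) with collections.Counter. A minimal sketch with toy data, assuming the items are hashable:)

from collections import Counter

list_A = ["x", "y", "x", "z", "x"]
list_B = ["x", "x", "y", "y"]

# & on Counters is a multiset intersection: it keeps the minimum
# count of each item, so duplicates are preserved
common = Counter(list_A) & Counter(list_B)
print(list(common.elements()))  # ['x', 'x', 'y']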

But now, I'm getting significantly longer times when using pool.starmap(). I can't work out why, and if anyone can give me some indication of what's going on, it would be greatly appreciated!

Here is the code for pool.map() that works quickly, as expected (list_B is a global referenced inside the listCompare function):

import multiprocessing

def listCompare(list_A):
    # list_B is a global here, since pool.map() only passes one argument
    toReturn = []
    for item in list_A:
        if item in list_B:
            toReturn.append(item)
    return toReturn

out = []
# chunks() is my own generator that splits list_a into sublists
chnks = chunks(list_a, multiprocessing.cpu_count())
with multiprocessing.Pool() as pool:
    for result in pool.map(listCompare, chnks):
        out.extend(result)
print("Parallel:", out)

Here is the code for pool.starmap() that works slowly. listCompare is modified to take 2 arguments here. (I can't use my chunks method, as I can't pass its yielded chunks into the tuple, so I've set the chunksize differently. Is this the reason for the slowdown?)

def listCompare(list_A, list_B):
    toReturn = []
    for item in list_A:
        if item in list_B:
            toReturn.append(item)
    return toReturn

with multiprocessing.Pool() as pool:
    for resultA in pool.starmap(listCompare, [(list_a1, list_b1)], chunksize=multiprocessing.cpu_count()):
        output_list1.extend(resultA)
    for resultB in pool.starmap(listCompare, [(list_a2, list_b2)], chunksize=multiprocessing.cpu_count()):
        output_list2.extend(resultB)
    for resultC in pool.starmap(listCompare, [(list_a3, list_b3)], chunksize=multiprocessing.cpu_count()):
        output_list3.extend(resultC)
    for resultD in pool.starmap(listCompare, [(list_a4, list_b4)], chunksize=multiprocessing.cpu_count()):
        output_list4.extend(resultD)

Thanks in advance, and apologies if I've missed out anything that may help in answering! As I said earlier, I know this can be done with intersection, but I need the frequencies of each occurrence, so I need to preserve duplicates.

Kav
  • Your example does not seem equivalent for map and starmap. What are `list_a1..._a4` and `list_b1...b4`? Are they subsections of `list_a` and `list_b`? In that case, why have you divided the lists in the first place (on top of using chunks) in the starmap approach? – Charchit Agarwal Jul 27 '22 at 19:29
  • `pool.starmap(listCompare, [(list_a1, list_b1)], ...)` starts exactly one process and waits for the result. Then you're starting a second process and waiting for the result. Obviously you're not getting any multiprocessing happening here. – Frank Yellin Jul 27 '22 at 19:29
  • @Charchit The starmap version allows me to pass different lists. They’re not subsections, just poorly named variables in this example I guess. – Kav Jul 27 '22 at 19:33
  • @FrankYellin I don’t understand how pool.starmap isn’t multiprocessing based on the chunksize parameter? – Kav Jul 27 '22 at 19:35
  • @Kav the chunksize is used to divide the iterable into the pieces each process gets passed in one go. Here the iterable already has length 1 (you are passing a list of length one containing a tuple), so the chunksize parameter could very well be omitted, as it's not doing anything. – Charchit Agarwal Jul 27 '22 at 19:38
  • @Charchit Ah yes. I think I’ve got confused with my own chunk method and trying to work around not being able to use it in starmap. – Kav Jul 27 '22 at 19:43
  • @Kav. pool.starmap expects a list of tuples. Each tuple is the arguments you're calling the function with. In your code, you're calling it with `[(list_a1, list_b1)]`, a list with length 1, containing a single tuple of length 2. You are calling your function once. – Frank Yellin Jul 28 '22 at 02:29
  • @FrankYellin Ah, I see. So, if I need to compare list_a and list_b, then compare list_c and list_d, then compare list_e and list_f, and so on… is there a way to do that with starmap? Otherwise I’d have to do it with map but hardcode the second lists into the listCompare method which I’d like to avoid. – Kav Jul 28 '22 at 06:19
  • "I'm essentially obtaining the intersection of List_A and List_B, but I need to preserve the frequencies of each item, and so have to do it in O(n^m)." What **exactly** are you doing? Why does it have to be O(N^M)? – juanpa.arrivillaga Jul 28 '22 at 07:51

1 Answer


As pointed out in the comments, this is slow because you are passing the whole iterable to a single process (the argument list has length one), rather than spreading the work across multiple processes.
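To see this concretely: the number of tasks equals the length of the iterable, and chunksize only controls how many of those tasks are shipped to a worker at a time. A small self-contained demonstration (hypothetical, not the asker's code):

import multiprocessing
import os

def which_pid(a, b):
    # report which worker process handled this call
    return os.getpid()

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # a length-1 iterable is a single task, regardless of chunksize
        print(set(pool.starmap(which_pid, [(1, 2)], chunksize=8)))
        # eight tuples are eight tasks, spread across the workers
        print(set(pool.starmap(which_pid, [(i, i) for i in range(8)], chunksize=1)))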

I would recommend sticking with map here, using functools.partial to fix the list_B argument when passing the target function:

from functools import partial
import multiprocessing

def listCompare(list_B, item):
    # pool.map calls this once per element of list_a1; list_B is
    # fixed up front via functools.partial
    return item if item in list_B else None


with multiprocessing.Pool() as pool:
    chunk_size = max(1, len(list_a1) // multiprocessing.cpu_count())
    for resultA in pool.map(partial(listCompare, list_b1), list_a1, chunksize=chunk_size):
        if resultA is not None:
            output_list1.append(resultA)

Keep in mind that the argument order has been swapped in listCompare for this to work (list_B comes first, since partial fixes the leftmost argument), and that pool.map calls the function once per element of list_a1, so listCompare now checks a single item rather than looping over a sublist. The chunking is then done via the chunksize parameter (admittedly a very crude version of it; you may want to fine-tune it).
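If the partial trick is unfamiliar: it pre-fills the leftmost positional argument, so the pool only has to supply the remaining one. A minimal illustration with toy values (not the asker's data):

from functools import partial

def listCompare(list_B, item):
    return item if item in list_B else None

check = partial(listCompare, ["a", "b"])  # fixes list_B up front
print(check("a"))  # 'a', same as listCompare(["a", "b"], "a")
print(check("z"))  # None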

Alternatively, if you want to use starmap, then you may keep using your chunk method like below:

def listCompare(list_A, list_B):
    toReturn = []
    for item in list_A:
        if item in list_B:
            toReturn.append(item)
    return toReturn

with multiprocessing.Pool() as pool:
    # chunk() is your own splitting helper; pair each chunk of
    # list_a1 with the full list_b1, one tuple per task
    chunks = chunk(list_a1, multiprocessing.cpu_count())
    for resultA in pool.starmap(listCompare, [(c, list_b1) for c in chunks]):
        output_list1.extend(resultA)

Since you have already chunked the iterable yourself, you no longer need to pass a chunksize argument.

As a sidenote, if the iterable you pass to the pool is very long, you may want to use imap instead, which consumes the input and yields results lazily rather than materializing everything in memory at once.
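For instance, a sketch under the same assumptions as the map version above (the per-item listCompare, list_a1, list_b1 and output_list1 from your code):

with multiprocessing.Pool() as pool:
    # imap consumes the input and yields results lazily, in order,
    # instead of building the full result list up front
    for result in pool.imap(partial(listCompare, list_b1), list_a1, chunksize=1024):  # chunksize here is arbitrary
        if result is not None:
            output_list1.append(result)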

Charchit Agarwal
  • `[(c, list_b1) for c in chunks]` This is exactly what I was looking for, thank you! – Kav Jul 28 '22 at 08:25