I am working on Python code that uses multiprocessing. The code is below:
import logging
import multiprocessing
import os

# Module-level logger used by the worker function
logger = logging.getLogger(__name__)

def square(n):
    # logger.info("Worker process id for {0}: {1}".format(n, os.getpid()))
    logger.info("Evaluating square of the number {0}".format(n))
    print('process id of {0}: {1}'.format(n, os.getpid()))
    return n * n

if __name__ == "__main__":
    # input list
    mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # creating a pool object
    p = multiprocessing.Pool(4)
    # map list to target function
    result = p.map(square, mylist)
    print(result)
My server has 4 CPU cores. If I pass 4 to Pool, only a single process is started. Shouldn't it start 4 separate processes?
If I set the value to 8 in the Pool object, this is the output I get:
process id of 1: 25872
process id of 2: 8132
process id of 3: 1672
process id of 4: 27000
process id of 6: 25872
process id of 5: 20964
process id of 9: 25872
process id of 8: 1672
process id of 7: 8132
process id of 10: 27000
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
This started 5 separate processes (25872, 8132, 1672, 27000, 20964), even though there are only 4 CPU cores.
I don't understand why the pool started only 1 process when the value is 4, but 5 separate processes when the value is 8.
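To make the comparison easier to reproduce, here is a minimal sketch that counts the distinct worker PIDs seen for a given pool size. The helper names `worker_pid` and `distinct_workers` are hypothetical, and the short `time.sleep` is my own assumption: it is only there so that one fast worker is less likely to take every task before the others pick one up.

```python
import multiprocessing
import os
import time


def worker_pid(n):
    # Brief pause (an assumption for this experiment) so that a single fast
    # worker is less likely to grab every task in the queue.
    time.sleep(0.05)
    return n, os.getpid()


def distinct_workers(pool_size, items):
    # Run the items through a pool and collect the PIDs that handled them.
    with multiprocessing.Pool(pool_size) as pool:
        pairs = pool.map(worker_pid, items)
    return {pid for _, pid in pairs}


if __name__ == "__main__":
    for size in (4, 8):
        pids = distinct_workers(size, list(range(1, 11)))
        print("pool size {0}: {1} distinct worker PIDs".format(size, len(pids)))
```

This only reports which workers actually returned a result, so the count can be lower than the pool size if some workers never receive a task.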
Can a Pool object be instantiated with a value greater than the number of CPU cores?
Also, what is the optimal value to use when instantiating the Pool object if the list contains a million records?
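For context on the million-record case, this is roughly the shape of what I have in mind, as a sketch only. It assumes sizing the pool from `os.cpu_count()` and passing a `chunksize` to `Pool.map`. The helper name `square_all` is hypothetical, and the value 10000 is an arbitrary guess on my part, not a recommendation.

```python
import multiprocessing
import os


def square(n):
    return n * n


def square_all(numbers, chunksize=10_000):
    # Assumption: one worker per reported CPU core (falls back to 1 if unknown).
    workers = os.cpu_count() or 1
    with multiprocessing.Pool(workers) as pool:
        # chunksize batches the inputs so each task sent to a worker
        # carries many items instead of one.
        return pool.map(square, numbers, chunksize=chunksize)


if __name__ == "__main__":
    results = square_all(range(1_000_000))
    print(len(results), results[:5])
```

Whether the core count is the right pool size here, and what chunk size makes sense, is exactly what I am unsure about.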
I have been through the official Python documentation, but I couldn't find this information. Please help.