I am trying to figure out whether the lists of objects I include in the list comprehension iterable for pool.map are accessed directly by the workers, or whether the workers receive copies of the lists.
I'm using DEAP and have successfully implemented multiprocessing. Originally I would read CSV files into arrays and send the arrays to the workers, but then each worker had to create its own list of objects. I thought that if I could create the objects once and send copies of them, it might speed things up.
It appears to work identically both ways, aside from a small speedup from creating the object lists in the main context. I want to make sure the workers aren't modifying the same list objects.
I create the lists by reading data from CSV files and appending objects to my lists defined in the main context.
Example:
    if __name__ == '__main__':
        start_time = time.time()
        pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
        toolbox.register("map", pool.map)

        _jobsArray = []
        with open(r'.\Datasets\Jobs.csv') as csvfile:
            readCSV = csv.reader(csvfile, delimiter=',')
            next(readCSV)  # Skip the header
            for row in readCSV:
                _jobsArray.append(Job(row[0], literal_eval(row[1]), literal_eval(row[3]),
                                      literal_eval(row[2]), literal_eval(row[4])))
Later I use list comprehension to create the necessary iterable:
fitnesses = toolbox.map(toolbox.evaluate, [(indiv, _jobsArray, _machinesArray) for indiv in invalid_ind])
I then unpack the tuple in my evaluation function and use the objects.
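The unpacking in the evaluation function looks roughly like this (a simplified sketch; the real fitness logic is omitted and the placeholder return value is made up):

```python
def evaluate(args):
    # Unpack the (indiv, jobs, machines) tuple built by the list comprehension.
    indiv, jobs, machines = args
    # ... the actual fitness computation using jobs and machines goes here ...
    return (sum(indiv),)  # placeholder only; DEAP expects a tuple of fitness values
```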
I used print statements at different stages to determine the state of the lists. A print right after unpacking the tuple shows the lists in their default state at the beginning of each evaluation. At the end of the evaluation they differ both from the original state and, in enough cases, between members of the population (some identical individuals are expected).
It appears the workers are not all editing the original lists created in the main context. Can someone confirm this?