I am attempting to create Pool() objects so that I can break down large arrays. However, on every pass through the code below after the first, the map never appears to run. Only the first pass seems to enter the function, even though the arguments are the same size. Even when I run it with the EXACT same arguments, only the first job.map(...) appears to run. Below is the source of my pain (not all the code in the file):
import csv
from multiprocessing import Pool

import numpy as np

# m, window_size, group_step, chunk_size, pa_table_name, shared_list and
# checkPep are defined elsewhere in the file.

def iterCount():
    # m is not in shared memory, as intended.
    global m
    m = m + 1
    return m

def thread_search(pair):
    divisor_lower = pair[0]
    divisor_upper = pair[1]
    for i in range(divisor_lower, divisor_upper, window_size):
        current_section = np.array(x[i: i + window_size])
        for row in current_section:
            # checkPep is a simple unique-in-array checking function.
            if row[2].startswith('NP') and checkPep(row[0]):
                # shared_list is a multiprocessing.Manager list.
                shared_list.append(row[[0, 1, 2]])
            m = iterCount()
            if not m % 1000000:
                print(f'Encountered m = {m}', flush=True)

def poolMap(pairs, group):
    job = Pool(3)
    print('Pool Created')
    print(len(pairs))
    job.map(thread_search, pairs)
    print('Pool Closed')
    job.close()

if __name__ == '__main__':
    for group in [1, 2, 3]:  # Example times to be run...
        x = None
        lower_bound = int((group - 1) * group_step)
        upper_bound = int(group * group_step)
        x = list(csv.reader(open(pa_table_name, "rt", encoding="utf-8"), delimiter="\t"))[lower_bound:upper_bound]
        print(len(x))
        divisor_pairs = [[int(lower_bound + (i - 1) * chunk_size), int(lower_bound + i * chunk_size)] for i in range(1, 6143)]
        poolMap(divisor_pairs, group)
The output of this script is (my annotations follow some of the values):
Program started: 03/09/19, 12:41:25 (Machine Time)
11008256 - Length of the file read in (in the group)
Pool Created
6142 - len(pairs)
Encountered m = 1000000
Encountered m = 1000000
Encountered m = 1000000
Encountered m = 2000000
Encountered m = 2000000
Encountered m = 2000000
Encountered m = 3000000
Encountered m = 3000000
Encountered m = 3000000 (Total size is ~ 9 million per set)
Pool Closed
11008256 (this is the number of lines read, correct value)
Pool Created
6142 (Number of pairs is correct, though map appears to never run...)
Pool Closed
11008256
Pool Created
6142
Pool Closed
At this point, the shared_list is saved, and only the first thread's results appear to be present.
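For context on the repeated "Encountered m" lines in the output: m is a plain module-level global, so each worker process keeps its own copy. The sketch below is a reduced illustration of that behaviour, not the real script. The names iter_count, work and final_counts, and the task count of 30, are made up for the example.

```python
import os
from multiprocessing import Pool

m = 0  # plain module-level global: every worker process gets its own copy


def iter_count():
    """Increment and return this process's private counter."""
    global m
    m += 1
    return m


def work(_):
    # Report which process handled the task and its local counter value.
    return os.getpid(), iter_count()


def final_counts(results):
    """Collapse (pid, count) results into the highest count seen per process."""
    counts = {}
    for pid, count in results:
        counts[pid] = max(counts.get(pid, 0), count)
    return counts


if __name__ == '__main__':
    with Pool(3) as pool:
        results = pool.map(work, range(30))
    per_process = final_counts(results)
    print(per_process)                # one entry per worker that received tasks
    print(sum(per_process.values()))  # 30: the per-process counts add up to the task count
```

Each worker counts only the tasks it handled, which would be consistent with every milestone being printed three times under Pool(3).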
I'm really at a loss as to what is happening here. I have searched for known bugs (?) and for similar reports, without success.
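One thing that can be checked in isolation is how the pair offsets relate to x, because x is re-read as the slice [lower_bound:upper_bound] on every pass while the pairs are built starting from lower_bound. The sketch below uses no multiprocessing and small made-up sizes: group_step = 12, chunk_size = 4 and window_size = 2 are assumptions, not the real values. slices_seen is a hypothetical helper that mirrors the x[i: i + window_size] indexing in thread_search.

```python
# Toy sizes; the real group_step, chunk_size and window_size are not shown above.
group_step = 12
chunk_size = 4
window_size = 2
table = list(range(36))  # stand-in for the rows of the full file


def slices_seen(x, pairs):
    """Hypothetical helper: count the rows thread_search would iterate over."""
    rows = 0
    for lower, upper in pairs:
        for i in range(lower, upper, window_size):
            rows += len(x[i: i + window_size])  # out-of-range slices are empty
    return rows


rows_per_group = {}
for group in [1, 2, 3]:
    lower_bound = (group - 1) * group_step
    upper_bound = group * group_step
    x = table[lower_bound:upper_bound]  # x is re-based to start at index 0
    pairs = [[lower_bound + (i - 1) * chunk_size, lower_bound + i * chunk_size]
             for i in range(1, 4)]
    rows_per_group[group] = slices_seen(x, pairs)
    print(group, len(x), rows_per_group[group])
```

With these toy numbers only group 1 sees any rows. For later groups the absolute offsets start at or beyond len(x), so every slice is empty and the loop body never executes, even though map is called. Whether the same thing happens in the real script depends on the actual values of group_step and chunk_size.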
Environment: Ubuntu 18.04, Python 3.6