Each spawned process has its own separate memory, which can make communicating between processes difficult. You probably need to look into using Manager from the multiprocessing library. Here is a small example script I made to process images, replacing white pixels with transparent ones. Note how each process receives the data it has to work on through a JoinableQueue, and how all processes write their results to the same place through the new_data variable, which is a list handled by Manager.
from PIL import Image
from multiprocessing import Process, JoinableQueue, Manager
from time import time


def worker_function(q, new_data):
    while True:  # Workers loop forever; as daemons they exit with the main process
        # print("Items in queue: {}".format(q.qsize()))
        index, pixel = q.get()  # Grab the next (index, pixel) pair from the queue
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)  # Near-white pixels become fully transparent
        else:
            out_pixel = pixel
        new_data[index] = out_pixel  # Store the result in the shared, manager-backed list
        q.task_done()  # Mark this item as completed


if __name__ == "__main__":
    start = time()
    q = JoinableQueue()
    my_image = Image.open('InputImage.jpg')
    my_image = my_image.convert('RGBA')
    datas = list(my_image.getdata())
    manager = Manager()
    new_data = manager.list([(0, 0, 0, 0)] * len(datas))
    print('putting image into queue')
    for count, item in enumerate(datas):
        q.put((count, item))
    print('starting workers')
    for i in range(50):
        p = Process(target=worker_function, args=(q, new_data))
        p.daemon = True
        p.start()
    print('main thread waiting')
    q.join()  # Wait until every queued item has been marked as done
    print("Saving Image")
    my_image.putdata(new_data)
    my_image.save('output.png', "PNG")
    end = time()
    print('{:.3f} seconds elapsed'.format(end - start))
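To see why the manager-backed list matters, here is a minimal sketch that is separate from the script above. The names append_marker and run_demo are made up for illustration. It hands a plain list and then a Manager list to a child process. The child's change to the plain list stays in the child's own copy of memory, while the change to the managed list should be visible to the parent.

```python
from multiprocessing import Manager, Process


def append_marker(target):
    """Append one item to whichever list-like object is passed in."""
    target.append("written by child")


def run_demo():
    """Let a child process append to a plain list and to a Manager list."""
    plain = []
    with Manager() as manager:
        shared = manager.list()
        for target in (plain, shared):
            child = Process(target=append_marker, args=(target,))
            child.start()
            child.join()
        # Copy the managed list out before the manager shuts down.
        return list(plain), list(shared)


if __name__ == "__main__":
    plain_result, shared_result = run_demo()
    print("plain list: ", plain_result)   # the child's append does not reach the parent
    print("shared list:", shared_result)  # the child's append went through the manager
```

Every access to the managed list is a round trip to the manager process, so this convenience is not free when it is done once per pixel.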
It isn't entirely clear how you expect your program to function, so this is the best I can do. I will note that if it's possible for you to work with a threading.Thread object instead, it would probably simplify your code. As an exercise, I wrote the same script shown above with threading as well:
from PIL import Image
from threading import Thread
from queue import Queue
import time

start = time.time()
q = Queue()
planeIm = Image.open('InputImage.jpg')
planeIm = planeIm.convert('RGBA')
datas = planeIm.getdata()
new_data = [0] * len(datas)
print('putting image into queue')
for count, item in enumerate(datas):
    q.put((count, item))


def worker_function():
    while True:  # Threads will loop forever
        # print("Items in queue: {}".format(q.qsize()))  # If you want to have the progress printed
        index, pixel = q.get()  # Grab an item from the queue we are processing
        if pixel[0] > 240 and pixel[1] > 240 and pixel[2] > 240:
            out_pixel = (0, 0, 0, 0)
        else:
            out_pixel = pixel
        new_data[index] = out_pixel  # Put the output at the specified index (may fill this list out of order)
        q.task_done()  # Mark this item as completed


print('starting worker threads')
for i in range(100):
    t = Thread(target=worker_function)
    t.daemon = True
    t.start()
print('main thread waiting')
q.join()  # This will wait for all items in the queue to be marked as complete
print('Queue has been joined')
planeIm.putdata(new_data)
planeIm.save('output.png', "PNG")
end = time.time()
elapsed = end - start
print('{:.3f} seconds elapsed'.format(elapsed))
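Both scripts apply the same per-pixel rule inside the worker. As a sketch, that rule could be pulled out into a small function so it can be checked without an image file or any workers. The names clear_if_white, clear_white_pixels, WHITE_THRESHOLD and TRANSPARENT are my own and are not part of Pillow.

```python
WHITE_THRESHOLD = 240  # same cut-off the scripts above use
TRANSPARENT = (0, 0, 0, 0)


def clear_if_white(pixel, threshold=WHITE_THRESHOLD):
    """Return a transparent pixel for near-white RGBA input, else the pixel unchanged."""
    red, green, blue = pixel[:3]
    if red > threshold and green > threshold and blue > threshold:
        return TRANSPARENT
    return pixel


def clear_white_pixels(pixels, threshold=WHITE_THRESHOLD):
    """Apply clear_if_white to every pixel, preserving the input order."""
    return [clear_if_white(pixel, threshold) for pixel in pixels]


if __name__ == "__main__":
    sample = [(255, 255, 255, 255), (10, 20, 30, 255), (241, 241, 240, 255)]
    # Only the first pixel is above the threshold on all three channels.
    print(clear_white_pixels(sample))
```

A worker in either script could then call clear_if_white(pixel) instead of repeating the comparison inline.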
It may not be immediately obvious, but the threaded version is much simpler. Threads share memory, so the queue and new_data are ordinary objects that every worker can reach directly: nothing has to be passed into the workers, and no manager-backed list is required. It also usually runs faster, because processes take a long time to spawn.
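If you want to check the start-up cost yourself, here is a rough timing sketch. The helper names do_nothing and time_startup are made up, and the count of 20 is arbitrary. The numbers depend heavily on your operating system and on how it starts processes, so I'm not showing any output.

```python
from multiprocessing import Process
from threading import Thread
from time import perf_counter


def do_nothing():
    """Empty task, so the measurement is dominated by start-up cost."""


def time_startup(worker_class, count):
    """Start and join `count` workers of the given class; return elapsed seconds."""
    begin = perf_counter()
    workers = [worker_class(target=do_nothing) for _ in range(count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return perf_counter() - begin


if __name__ == "__main__":
    count = 20  # arbitrary; kept small so the demo finishes quickly
    print("threads:   {:.3f} s".format(time_startup(Thread, count)))
    print("processes: {:.3f} s".format(time_startup(Process, count)))
```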