I have a set of large TIFF images (3200 x 3200). I want to load them in a Jupyter notebook, compute averages, and create a NumPy array. To make it concrete: each movie clip consists of 10 TIFF images, and there are several movie clips. I want to compute the average of the 10 frames for each movie and return an array of shape (# movie clips x 3200 x 3200).
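As a toy illustration of the output I am after, here is a minimal sketch on synthetic data. The small sizes and the random `uint16` frames are assumptions standing in for my real files, which are 10 frames of 3200 x 3200 per movie.

```python
import numpy as np

# Hypothetical small sizes so the sketch runs quickly.
num_movies, num_frames, height, width = 3, 10, 4, 4

rng = np.random.default_rng(0)
# Synthetic stand-in for the TIFF frames: shape (movies, frames, height, width).
frames = rng.integers(0, 65535, size=(num_movies, num_frames, height, width), dtype=np.uint16)

# Average over the frame axis; accumulating in float64 avoids uint16 overflow.
im_avg_all = frames.mean(axis=1, dtype=np.float64)
print(im_avg_all.shape)  # (3, 4, 4)
```

The result has one averaged image per movie, which is the layout I want for the real data.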
Here is my current serial code:
import numpy as np
import tifffile

def uselibtiff(f):
    # Read a TIFF file into a NumPy array (despite the name, this uses tifffile, not libtiff).
    tif = tifffile.imread(f)
    return tif

def avgFrames(imgfile_Movie, im_shape):
    # Sum all frames of one movie; imgfile_Movie is a list of TIFF file paths for that movie.
    # The caller divides the sum by num_frames to get the average.
    im = np.zeros(im_shape)
    for img_frame in imgfile_Movie:
        frame_im = uselibtiff(img_frame)
        im += frame_im
    return im
# imgfiles_byMovie is a 2D list: one list of frame file paths per movie
im_avg_all = np.zeros((num_movies, im_shape[0], im_shape[1]))
for n, imgfile_Movie in enumerate(imgfiles_byMovie):
    print(n)
    #start = timeit.default_timer()
    results = avgFrames(imgfile_Movie, im_shape) / num_frames
    #end = timeit.default_timer()
    #print(end-start)
    im_avg_all[n] = results
So far I've noticed that avgFrames takes about 3 seconds per movie (10 frames). I wanted to speed things up and tried using multiprocessing in Python:
import multiprocessing as mp
print("Number of processors: ", mp.cpu_count())
pool = mp.Pool(mp.cpu_count())
results = [pool.apply(avgFrames, args=(imgfile_Movie, im_shape)) for imgfile_Movie in imgfiles_byMovie]
pool.close()
Unfortunately, the above code runs very slowly, and I have to terminate it before seeing any results. What am I doing wrong here?
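One thing I suspect: according to the multiprocessing documentation, `pool.apply` blocks until its result is ready, so the list comprehension above would hand the movies to the pool one at a time. Below is a minimal sketch of the `pool.map` pattern, which submits all tasks at once. It uses `multiprocessing.pool.ThreadPool` (which has the same interface as `mp.Pool`) and a hypothetical `avg_synthetic_movie` function with random frames, so it runs without my TIFF files. I have not measured whether this pattern helps with the real data.

```python
import numpy as np
from multiprocessing.pool import ThreadPool

def avg_synthetic_movie(seed, num_frames=10, shape=(8, 8)):
    # Hypothetical stand-in for avgFrames: averages synthetic frames
    # instead of reading TIFF files from disk.
    rng = np.random.default_rng(seed)
    total = np.zeros(shape)
    for _ in range(num_frames):
        total += rng.random(shape)
    return total / num_frames

seeds = [0, 1, 2, 3]  # one hypothetical "movie" per seed

with ThreadPool(4) as pool:
    # map submits every task up front and returns results in input order.
    results = pool.map(avg_synthetic_movie, seeds)

im_avg_all = np.stack(results)
print(im_avg_all.shape)  # (4, 8, 8)
```

With my real code, the equivalent call would pass the list of per-movie file lists instead of `seeds`, and the worker would still need `im_shape` supplied somehow (for example via `functools.partial`).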