I'm trying to convert a preprocessed dataset, stored as pickle objects, back into images. There are 820 images in total, each at either 227x227 or 299x299 resolution. The code is attached below.
The problem is that tqdm initially shows several hundred files being converted per second, but the rate then drops off sharply: by the 500th file it is down to about 1 file per second. I'm not sure what's causing this, and I have come across suggestions to use concurrency to solve it. I've also tried saving the plot with Matplotlib's savefig, but I run into the same slowdown.
I'd like to know which part of the code is causing the slowdown and how to fix it, since I have to convert several hundred pickle files back into images.
EDIT: The problem was that the program was running out of memory, which slowed it down.
import _pickle
import os

import matplotlib.pyplot as plt  # only used by the commented-out savefig variant
import png  # not used in this snippet
from PIL import Image  # needed for Image.fromarray below
from tqdm import tqdm

# Convert every pickle file in the current directory to a JPEG.
for filename in tqdm(os.listdir(os.getcwd())):
    if "pickle" in filename:
        try:
            with open(filename, 'rb') as inputfile:
                im = _pickle.load(inputfile, encoding='latin1')
            size = im.shape[0]
            ims = Image.fromarray(im.reshape([size, size]))
            #plt.imshow(im.reshape([size,size]), cmap = 'gray')
            #plt.savefig(filename + '.png')
            #ims = ims.resize((size, size), Image.ANTIALIAS) # LANCZOS as of Pillow 2.7
            ims.save(filename + '.jpeg', quality=95)
        except _pickle.UnpicklingError:
            pass
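
Given the EDIT above, one way to keep memory use flat might be to handle each file inside its own function, so the loaded array and the image go out of scope before the next file is read. This is only a sketch, not a confirmed fix: the post does not say what was holding the memory. The names convert_pickle and convert_all are made up for illustration, and the rescaling to 8-bit is an assumption about the data (JPEG needs 8-bit pixels).

```python
import pickle
from pathlib import Path

import numpy as np
from PIL import Image


def convert_pickle(path: Path) -> Path:
    """Load one pickled square image array and save it next to the source as JPEG."""
    with open(path, "rb") as f:
        arr = np.asarray(pickle.load(f, encoding="latin1"))
    side = arr.shape[0]
    arr = arr.reshape(side, side)
    # Assumption: non-uint8 data is rescaled linearly to 0..255 before saving.
    if arr.dtype != np.uint8:
        lo, hi = float(arr.min()), float(arr.max())
        scale = 255.0 / (hi - lo) if hi > lo else 0.0
        arr = ((arr - lo) * scale).astype(np.uint8)
    out = path.with_name(path.name + ".jpeg")
    Image.fromarray(arr).save(out, quality=95)
    return out  # arr and the image are released when the function returns


def convert_all(folder: Path) -> int:
    """Convert every pickle file in folder; return how many were converted."""
    converted = 0
    for path in sorted(folder.iterdir()):
        # Skip unrelated files and JPEGs written by an earlier run.
        if "pickle" not in path.name or path.suffix == ".jpeg":
            continue
        try:
            convert_pickle(path)
            converted += 1
        except pickle.UnpicklingError:
            continue
    return converted
```

Calling convert_all(Path.cwd()) would mirror the loop in the question. If the Matplotlib route is used instead, calling plt.close() after each savefig keeps figures and their drawn images from accumulating across iterations.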