I'm trying to speed up a simple, symmetrically centered image downsampling algorithm in Python. I've coded up a naive approach as a lower-bound benchmark, but I'd like to get this to work significantly faster.
For simplicity's sake, my image is a circle at a resolution of 4608x4608 (I'll be working with resolutions of this scale), which I'd like to downsample by a factor of 9 (i.e. to 512x512). Below is the code I've written that creates the image at high resolution and downsamples it by a factor of 9.
All this does is map each pixel from high-res space onto one in low-res space (symmetrically around the centroid) and sum all the high-res pixels in a given area into that one low-res pixel.
import numpy as np
import matplotlib.pyplot as plt
import time
print('rendering circle at high res')
# image dimensions.
dim = 4608
# generate high sampled image.
xx, yy = np.mgrid[:dim, :dim]
highRes = (xx - dim//2) ** 2 + (yy - dim//2) ** 2
print('render done')
print('downsampling')
t0 = time.time()
# center of the high sampled image.
cen = dim//2
ds = 9
# calculate offsets.
x = 0
offset = (x-cen+ds//2+dim)//ds
# calculate the downsample dimension.
x = dim-1
dsdim = 1 + (x-cen+ds//2+dim)//ds - offset
# generate a blank downsampled image.
lowRes = np.zeros((dsdim, dsdim))
for y in range(0, dim):
    yy = (y-cen+ds//2+dim)//ds - offset
    for x in range(0, dim):
        xx = (x-cen+ds//2+dim)//ds - offset
        lowRes[yy, xx] += highRes[y, x]
t1 = time.time()
total = t1-t0
print('time taken %f seconds' % total)
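One vectorized idea for comparison (just a sketch — it assumes the same dim, ds, and centering arithmetic as above): because the centered binning only leaves the first and last bins partial, the double loop can be reproduced by zero-padding the image until every low-res pixel owns a full ds x ds block, then reshaping and summing out the block axes. Note that with this centering the offset arithmetic actually yields a 513x513 output for dim = 4608, ds = 9, since the two edge bins are partial.

```python
import numpy as np

dim, ds = 4608, 9
xx, yy = np.mgrid[:dim, :dim]
highRes = (xx - dim//2) ** 2 + (yy - dim//2) ** 2

cen = dim//2
# same offset/dimension arithmetic as the loop version above.
offset = (0 - cen + ds//2 + dim)//ds
dsdim = 1 + (dim - 1 - cen + ds//2 + dim)//ds - offset

# the centered binning leaves the first and last bins partial, so pad
# with zeros until every low-res pixel owns a full ds x ds block.
padLo = (0 - cen + ds//2 + dim) % ds
padHi = dsdim*ds - dim - padLo
padded = np.pad(highRes, ((padLo, padHi), (padLo, padHi)), mode='constant')
# reshape to (dsdim, ds, dsdim, ds) and sum out the two block axes.
lowRes = padded.reshape(dsdim, ds, dsdim, ds).sum(axis=(1, 3))
```

Padding with zeros keeps the per-bin sums identical to the loop version, since the extra pixels contribute nothing.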
I have numpy with BLAS and LAPACK set up on my machine, and I know a significant gain can be achieved by taking advantage of this, but I'm a bit stuck on how to proceed. This is my progress so far.
import numpy as np
import matplotlib.pyplot as plt
import time
print('rendering circle at high res')
# image dimensions.
dim = 4608
# generate high sampled image.
xx, yy = np.mgrid[:dim, :dim]
highRes = (xx - dim//2) ** 2 + (yy - dim//2) ** 2
print('render done')
print('downsampling')
t0 = time.time()
# center of the high sampled image.
cen = dim//2
ds = 9
# calculate offsets.
x = 0
offset = (x-cen+ds//2+dim)//ds
# calculate the downsample dimension.
x = dim-1
dsdim = 1 + (x-cen+ds//2+dim)//ds - offset
# generate a blank downsampled image.
lowRes = np.zeros((dsdim, dsdim))
ar = np.arange(0, dim)
x, y = np.meshgrid(ar, ar)
# calculate symmetrically centered positions around the centroid.
yy = (y-cen+ds//2+dim)//ds - offset
xx = (x-cen+ds//2+dim)//ds - offset
# to-do : code to map xx, yy into lowRes
t1 = time.time()
total = t1-t0
print('time taken %f seconds' % total)
This version is about 16x faster on my machine, but it's incomplete: I'm not sure how to efficiently accumulate the high-res pixels into lowRes using these xx, yy index arrays.
Maybe there's another way to speed it up entirely? Not sure... Thanks!
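Edit: one candidate I've found for the missing mapping step is np.add.at, which performs an unbuffered scatter-add. A plain fancy-indexed lowRes[yy, xx] += highRes silently keeps only one contribution per repeated target index, so it can't be used here. A sketch, using the same arithmetic as the snippet above:

```python
import numpy as np

dim, ds = 4608, 9
gx, gy = np.mgrid[:dim, :dim]
highRes = (gx - dim//2) ** 2 + (gy - dim//2) ** 2

cen = dim//2
offset = (0 - cen + ds//2 + dim)//ds
dsdim = 1 + (dim - 1 - cen + ds//2 + dim)//ds - offset
lowRes = np.zeros((dsdim, dsdim))

ar = np.arange(0, dim)
x, y = np.meshgrid(ar, ar)
yy = (y-cen+ds//2+dim)//ds - offset
xx = (x-cen+ds//2+dim)//ds - offset

# np.add.at accumulates every (yy, xx) contribution, even when target
# indices repeat; lowRes[yy, xx] += highRes would drop duplicates.
np.add.at(lowRes, (yy, xx), highRes)
```

np.add.at is correct but not especially fast; a padded reshape-and-sum over full ds x ds blocks would likely beat it.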