Optimize performance for calculation of Euclidean distance between two images

Question

I implemented the k-nearest-neighbours algorithm in Python to classify some randomly picked images from the MNIST database. However, I found my distance function to be quite slow: an analysis of 10 test images against the training set of 10k images takes about 2 minutes. The images have a resolution of 28x28 pixels. Since I'm new to Python, I have the feeling this could be faster. The function is supposed to calculate the Euclidean distance between two same-sized grayscale images.

import math
import numpy

def calculateDistance(image1, image2):
    distance = 0
    for i in range(len(image1)):
        # iterate over the columns of row i (also works for non-square images)
        for j in range(len(image1[i])):
            distance += math.pow((image1[i][j]-image2[i][j]),2)
    distance = numpy.sqrt(distance)
    return distance

This question seems like a good fit for [Code Review](http://codereview.stackexchange.com/). — TigerhawkT3, Oct 16 '15 at 22:07

score 9 · Accepted Answer · answered Oct 16 '15 at 22:23

If you're using numpy arrays to represent the images, you could use the following instead:

import numpy
def calculateDistance(i1, i2):
    # Note: this is the squared distance; apply numpy.sqrt for the Euclidean distance.
    return numpy.sum((i1-i2)**2)

This should be much faster because NumPy does the heavy lifting in a fast C implementation instead of a Python loop. Also consider caching results so that the distance between the same pair of images is not computed twice.
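Going one step further, the distances from one test image to every training image can be computed in a single NumPy call rather than one function call per pair. The following is a minimal sketch, assuming the training set is stored as one array of shape (n, 28, 28); the name `distances_to_all` and the random stand-in data are hypothetical and not from the original post.

```python
import numpy

def distances_to_all(test_image, train_images):
    """Euclidean distance from one image to each image in a stack.

    test_image:   array of shape (h, w)
    train_images: array of shape (n, h, w)
    Returns an array of n distances.
    """
    # Cast to a signed type first so the subtraction cannot wrap around
    # if the pixels are stored as unsigned 8-bit integers (an assumption).
    diff = train_images.astype(numpy.int64) - test_image.astype(numpy.int64)
    # Broadcasting subtracts test_image from every training image;
    # summing over the two pixel axes leaves one value per training image.
    return numpy.sqrt(numpy.sum(diff ** 2, axis=(1, 2)))

# Hypothetical stand-in data with MNIST-like shapes.
rng = numpy.random.default_rng(0)
train = rng.integers(0, 256, size=(100, 28, 28), dtype=numpy.uint8)
test = train[7]

d = distances_to_all(test, train)
nearest = int(numpy.argmin(d))  # index of the closest training image
```

For k-nearest-neighbours, `numpy.argsort(d)[:k]` would then give the indices of the k closest training images.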

Niklas B.

Yes! This gives me a huge speed-up! It now takes ~8 secs to classify 10 test images. Thanks! – wodzu Oct 17 '15 at 10:51

score 0 · Answer 2 · answered Oct 16 '15 at 22:14

1) Compute the difference between the two images' pixel values into a temporary variable, then multiply that variable by itself (an integer operation) instead of calling math.pow, which is a floating-point operation.
2) If you're only comparing distances, e.g. to find the pair with the smallest distance, don't bother taking the square root at the end. This won't speed things up much because it's outside the loop, but it's not needed if you're only using the result for relative comparisons.
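The two suggestions above could be sketched roughly as follows. This assumes each image is a nested list of plain Python ints; the function name `squared_distance` and the tiny sample images are hypothetical, chosen only for illustration.

```python
def squared_distance(image1, image2):
    """Squared Euclidean distance between two same-sized images.

    Assumes each image is a nested list of plain Python ints.
    """
    total = 0
    for row1, row2 in zip(image1, image2):
        for p1, p2 in zip(row1, row2):
            d = p1 - p2      # difference stored once in a temporary
            total += d * d   # integer multiply instead of math.pow
    return total             # no sqrt: the ordering of distances is unchanged

a = [[0, 10], [20, 30]]
b = [[3, 14], [20, 25]]
print(squared_distance(a, b))  # 9 + 16 + 0 + 25 = 50
```

Skipping the square root is safe for ranking because the square root is monotonic: the image with the smallest squared distance also has the smallest distance.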

qoba

I tried this, but for some reason unknown to me it caused the classifications to be less successful. It also did not speed up the calculation; it took even longer (about 10 mins). – wodzu Oct 17 '15 at 10:49
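One possible explanation for the worse classifications, offered only as a guess: the thread does not say how the images are stored, but if they are NumPy arrays of dtype uint8 (an assumption), integer arithmetic on them wraps around modulo 256. The small sketch below, with made-up pixel values, shows the effect and how casting to a wider type avoids it.

```python
import numpy

# Made-up pixel values, stored as unsigned 8-bit integers.
a = numpy.array([0, 10], dtype=numpy.uint8)
b = numpy.array([200, 20], dtype=numpy.uint8)

# uint8 arithmetic wraps modulo 256, so both the subtraction and the
# squaring can silently produce values unrelated to the true result.
wrapped = int(numpy.sum((a - b) ** 2))

# Converting to a wider signed type first gives the true value.
a_wide = a.astype(numpy.int64)
b_wide = b.astype(numpy.int64)
correct = int(numpy.sum((a_wide - b_wide) ** 2))  # 200**2 + 10**2 = 40100
```

Checking `image.dtype` on one of the loaded images would show whether this applies.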
