
I'm trying to do the assignments from the Stanford CS231n 2017 CNN course on my own.

I'm trying to compute the L2 distance using only matrix multiplication and sum broadcasting with NumPy. The L2 distance is:

$$d(p, q) = \sqrt{\sum_{i} \left(p_i - q_i\right)^2}$$

And I think I can do it if I use this formula:

$$\lVert p - q \rVert^2 = \sum_i p_i^2 \;-\; 2\sum_i p_i q_i \;+\; \sum_i q_i^2 \;=\; \lVert p \rVert^2 - 2\,p \cdot q + \lVert q \rVert^2$$
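
Stacking all test points into $X \in \mathbb{R}^{n \times D}$ and all training points into $T \in \mathbb{R}^{m \times D}$ (my notation, not from the course handout), the same expansion applied to every pair at once is what makes one matrix product plus two broadcast sums sufficient:

$$D^2_{ij} = \lVert X_i \rVert^2 - 2\, X_i \cdot T_j + \lVert T_j \rVert^2, \qquad D^2 = s_X \mathbf{1}_m^{\top} - 2\, X T^{\top} + \mathbf{1}_n\, s_T^{\top},$$

where $s_X$ and $s_T$ are the column vectors of row-wise squared norms.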

The following code shows three methods of computing the L2 distance. If I compare the output of the method compute_distances_two_loops with the output of the method compute_distances_one_loop, they are equal. But when I compare the output of the method compute_distances_two_loops with the output of the method compute_distances_no_loops, where I have implemented the L2 distance using only matrix multiplication and sum broadcasting, they are different.

def compute_distances_two_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a nested loop over both the training data and the 
test data.

Inputs:
- X: A numpy array of shape (num_test, D) containing test data.

Returns:
- dists: A numpy array of shape (num_test, num_train) where dists[i, j]
  is the Euclidean distance between the ith test point and the jth training
  point.
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and the jth    #
            # training point, and store the result in dists[i, j]. You should   #
            # not use a loop over dimension.                                    #
            #####################################################################
            #dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
            dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
            #####################################################################
            #                       END OF YOUR CODE                            #
            #####################################################################
    return dists

def compute_distances_one_loop(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a single loop over the test data.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis=1))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    print(dists.shape)
    return dists

def compute_distances_no_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using no explicit loops.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    dists = np.sqrt(-2 * np.dot(X, self.X_train.T) +
                    np.sum(np.square(self.X_train), axis=1) +
                    np.sum(np.square(X), axis=1)[:, np.newaxis])
    print(dists.shape)
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists

You can find full, working, testable code here.

Do you know what I am doing wrong in compute_distances_no_loops, or elsewhere?

UPDATE:

The code that produces the failure message is:

dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')

And the message:

Difference was: 372100.327569
Uh-oh! The distance matrices are different
VansFannel
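
For reference, the vectorized expression itself can be checked in isolation. The following standalone sketch (mine, not part of the assignment) compares it against the two-loop arithmetic on small random float64 data, where the two agree to machine precision:

    import numpy as np

    rng = np.random.default_rng(42)
    X_train = rng.standard_normal((7, 4))  # stand-in training data
    X_test = rng.standard_normal((5, 4))   # stand-in test data

    # Two-loop reference, same arithmetic as compute_distances_two_loops
    ref = np.zeros((5, 7))
    for i in range(5):
        for j in range(7):
            ref[i, j] = np.sqrt(np.sum(np.square(X_test[i] - X_train[j])))

    # Vectorized expression from compute_distances_no_loops
    vec = np.sqrt(-2 * X_test.dot(X_train.T)
                  + np.sum(np.square(X_train), axis=1)
                  + np.sum(np.square(X_test), axis=1)[:, np.newaxis])

    print(np.abs(ref - vec).max())  # on the order of 1e-15 for float64 data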

5 Answers


Here is how you can compute pairwise distances between rows of X and Y without creating any 3-dimensional matrices:

import numpy as np

def dist(X, Y):
    # Row-wise squared norms, kept as column vectors so they broadcast
    sx = np.sum(X**2, axis=1, keepdims=True)  # shape (n, 1)
    sy = np.sum(Y**2, axis=1, keepdims=True)  # shape (m, 1)
    # ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, expanded for all pairs at once
    return np.sqrt(-2 * X.dot(Y.T) + sx + sy.T)
Viktoriya Malyasova
  • Thanks a lot!!! Your code works like a charm. But, I don't know why, I'm still getting the same message: "Difference was: 372100.327569 - Uh-oh! The distance matrices are different". – VansFannel Nov 26 '20 at 14:14
  • Something that helped me understand the "magic" of NumPy happening in the operation `sx + sy.T`:

        sx = np.sum(X**2, axis=1, keepdims=True)
        print("sx.shape", sx.shape)  # sx.shape (500, 1)
        sy = np.sum(Y**2, axis=1, keepdims=True)
        print("sy.T.shape", sy.T.shape)  # sy.T.shape (1, 5000)
        square_sums = sx + sy.T
        print("square_sums.shape", square_sums.shape)  # square_sums.shape (500, 5000)

    – Moshe Jonathan Gordon Radian Jul 11 '22 at 09:17
  • Note that this works because if you distribute `(X-Y)^2` = `(X-Y)*(X-Y)` you get `X^2 + Y^2 - 2XY` – Luke Jan 03 '23 at 02:32
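
Hypothetical usage of the dist helper above, with the shapes from the assignment (500 test rows, 5000 training rows, D = 3072 for CIFAR-10's 32×32×3 images); the data here is random and only the output shape is being checked:

    X = np.random.rand(500, 3072)   # stand-in test points, float64
    Y = np.random.rand(5000, 3072)  # stand-in training points, float64
    D = dist(X, Y)
    print(D.shape)                  # (500, 5000)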

I think that you are looking for the pairwise distance.

There is an amazing trick to do it in a few lines. You have to play cleverly with broadcasting:

X_test = np.expand_dims(X, 1)              # shape: [num_test, 1, D]
X_train = np.expand_dims(self.X_train, 0)  # shape: [1, num_train, D]
dists = np.square(X_test - X_train)        # thanks to broadcasting: [num_test, num_train, D]
dists = np.sqrt(np.sum(dists, axis=-1))    # [num_test, num_train]
Guillem
  • Thanks for your answer. I have tested it, and line `dists = np.square(X_train - X_test) # Thanks to broadcast [num_train, num_tests, D]` throws the error `MemoryError: Unable to allocate 7.15 GiB for an array with shape (5000, 500, 3072) and data type uint8`. – VansFannel Nov 22 '20 at 12:34
  • This is because you have a lot of data loaded in memory. But you will have the same problem with other approaches to this pairwise distance. – Guillem Nov 22 '20 at 13:21
  • This code is part of a university assignment, and I don't think the professors made it impossible to execute unless you have a lot of memory. – VansFannel Nov 22 '20 at 14:58
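
As the comments above show, materializing the full (5000, 500, 3072) intermediate needs gigabytes of memory. A common compromise, not from this answer, is to broadcast one chunk of test rows at a time; it gives up the single-expression form but bounds the intermediate's size (a sketch of mine; the helper name and chunk size are made up):

    import numpy as np

    def pairwise_dists_chunked(X_test, X_train, chunk=50):
        num_test, num_train = X_test.shape[0], X_train.shape[0]
        out = np.empty((num_test, num_train))
        for start in range(0, num_test, chunk):
            # Broadcast only `chunk` test rows against all training rows at once
            diff = X_test[start:start + chunk, None, :] - X_train[None, :, :]
            out[start:start + chunk] = np.sqrt(np.sum(np.square(diff), axis=-1))
        return out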

I believe the issue comes from inconsistent array shapes.

# a^2 term, shape (500, 1)
alpha = np.sum(np.square(X), axis=1)
alpha = alpha.reshape(len(alpha), 1)
print(alpha.shape)

# b^2 term, shape (1, 5000)
beta = np.sum(np.square(self.X_train.T), axis=0)
beta = beta.reshape(1, len(beta))
print(beta.shape)

# ab term, shape (500, 5000)
alphabeta = np.dot(X, self.X_train.T)
print(alphabeta.shape)

dists = np.sqrt(-2 * alphabeta + alpha + beta)
Flair
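
To see why the reshapes matter (a small sketch of mine): two 1-D arrays of different lengths cannot be added at all, while the (n, 1) and (1, m) shapes broadcast to a full matrix:

    import numpy as np

    a = np.ones(500)                         # shape (500,)
    b = np.ones(5000)                        # shape (5000,)
    # a + b  raises: operands could not be broadcast together
    c = a.reshape(-1, 1) + b.reshape(1, -1)  # (500, 1) + (1, 5000)
    print(c.shape)                           # (500, 5000)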

It's a late response, but I solved it in a different way and wanted to post it. When I was solving this, I was unaware of NumPy's column/row vector subtraction from a matrix. As it turns out, we can subtract an n×1 or 1×m vector from an n×m matrix, and it is subtracted from every column or row, respectively. If you work with a library that doesn't support this kind of behavior, you can use mine. For this situation, I worked out the math, and the result is the following:

# Row-wise squared norms, kept as column vectors
sum_x_train = np.sum(self.X_train**2, axis=1, keepdims=True)
sum_x_test = np.sum(X**2, axis=1, keepdims=True)
# Cross term 2 * X_train . X^T, shape (num_train, num_test)
sum_2x_tr_te = np.dot(self.X_train, X.T) * 2
# Tile the squared norms explicitly instead of relying on broadcasting
sum_x_train = np.dot(sum_x_train, np.ones((1, X.shape[0])))
sum_x_test = np.dot(sum_x_test, np.ones((1, self.X_train.shape[0])))
dists = np.sqrt(sum_x_test.T + sum_x_train - sum_2x_tr_te).T

The downside of this approach is it uses more memory.

Hakan Akgün
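
For what it's worth, in NumPy the explicit products with np.ones materialize exactly the matrices that broadcasting would produce implicitly; a quick check (my sketch):

    import numpy as np

    a = np.arange(3.0).reshape(3, 1)        # column vector, shape (3, 1)
    tiled = np.dot(a, np.ones((1, 4)))      # explicit tiling to (3, 4)
    implicit = np.broadcast_to(a, (3, 4))   # broadcast view, no copy
    print(np.array_equal(tiled, implicit))  # True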

This is my solution for the function compute_distances_no_loops() that the OP asked for. I don't use the sqrt() function for performance reasons (note that this returns squared distances, which preserve the neighbor ordering but will not match the other methods' output until you take the square root):

def compute_distances_no_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #---------------
    # Get square of X and X_train
    X_sq = np.sum(X**2, axis=1, keepdims=True) 
    Xtrain_sq = np.sum(self.X_train**2, axis=1, keepdims=True)
    # Calculate (squared) dists as (X_train - X)**2 = X_train**2 - 2*X_train*X + X**2
    dists = -2*X.dot(self.X_train.T) + X_sq + Xtrain_sq.T
    #---------------  
    return dists
tqbinh
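
Since the square root is monotonic, ranking by squared distance selects the same nearest neighbors; a small illustration (mine):

    import numpy as np

    d2 = np.array([[9.0, 1.0, 4.0]])        # squared distances to three training points
    print(np.argsort(d2, axis=1))           # [[1 2 0]]
    print(np.argsort(np.sqrt(d2), axis=1))  # [[1 2 0]] -- same ordering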