Is it possible to find similarities between rows in a matrix without a loop?

Question

I have a 2D numpy array. I'm trying to compute the similarities between its rows and store them in a similarities array. Is this possible without a loop? Thanks for your time!

# ratings.shape = (943, 1682)

arri = np.zeros(943)
arri = np.where(arri == 0)[0]

arrj = np.zeros(943)
arrj = np.where(arrj ==0)[0]

similarities = np.zeros((ratings.shape[0], ratings.shape[0]))

similarities[arri, arrj] = np.abs(ratings[arri]-ratings[arrj])

I want to build a 2D array similarities in which similarities[i, j] is the difference between row i and row j of ratings.

ValueError: shape mismatch: value array of shape (943,1682) could not be broadcast to indexing result of shape (943,)

I want to make a 2D-array ```similarities``` in that similarities[i, j] is the differentiation between row i and row j in ```ratings```. — diepitus, Jun 08 '21 at 16:59

Habetuz · Accepted Answer · 2021-06-08T20:55:41.547

The problem is how numpy pairs up the indices when a two-dimensional array is indexed with two index arrays.

First some setup:

import numpy

ratings = numpy.arange(1, 6)

indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

ratings: [1 2 3 4 5]

indicesX: [[0][1][2][3][4]]

indicesY: [[0][1][2][3][4]]

Now let's see what this kind of indexing produces (the right-hand side subtracts ratings[0] so that the pattern is visible):

similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))
similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[0])

similarities:

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]

As you can see, numpy iterates over similarities basically like the following:

for i in range(5):
    similarities[indicesX[i], indicesY[i]] = numpy.abs(ratings[i]-ratings[0])

similarities:

[[0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 2. 0. 0.]
 [0. 0. 0. 3. 0.]
 [0. 0. 0. 0. 4.]]
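
To tie this back to the error in the question, here is a minimal sketch that reproduces the shape mismatch on a small array. The names ratings_2d, rows and grid are made up for illustration, and the (4, 3) array is only a stand-in for the real (943, 1682) ratings data.

```python
import numpy

# Hypothetical small stand-in for the question's (943, 1682) ratings array.
ratings_2d = numpy.arange(12).reshape(4, 3)
rows = numpy.arange(4)

grid = numpy.zeros((4, 4))

# Two 1-D index arrays are paired element by element, so this selects only
# grid[0, 0], grid[1, 1], grid[2, 2] and grid[3, 3].
selected = grid[rows, rows]
print(selected.shape)

# The right-hand side of the question's assignment keeps every column.
difference = numpy.abs(ratings_2d[rows] - ratings_2d[rows])
print(difference.shape)

# Assigning a (4, 3) value to a (4,) selection cannot be broadcast.
mismatch_raised = False
try:
    grid[rows, rows] = difference
except ValueError as error:
    mismatch_raised = True
    print("ValueError:", error)
```

The selection has one entry per index pair, while the value has one row per index pair, which is why numpy refuses the assignment.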

Now instead we need indices like the following to iterate through the entire array:

indicesX = [0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4,0,1,2,3,4]
indicesY = [0,0,0,0,0,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4]

We build them as follows:

# Reshape indicesX from (x,1) to (x,). That's important for numpy.tile().
indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])

indicesY = numpy.repeat(indicesY, ratings.shape[0])

indicesX: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]

indicesY: [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4]

Perfect! Now just call similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY]) again and we see:

similarities:

[[0. 1. 2. 3. 4.]
 [1. 0. 1. 2. 3.]
 [2. 1. 0. 1. 2.]
 [3. 2. 1. 0. 1.]
 [4. 3. 2. 1. 0.]]
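
As an alternative that is not part of the approach above, the same matrix can probably be obtained with broadcasting, without building any index arrays. This is a minimal sketch for the one-dimensional ratings example used here. Note that the result keeps the integer dtype of ratings instead of the float dtype of numpy.zeros.

```python
import numpy

ratings = numpy.arange(1, 6)

# ratings[:, None] has shape (5, 1) and ratings[None, :] has shape (1, 5).
# Broadcasting subtracts every pair of entries, giving a (5, 5) result.
similarities = numpy.abs(ratings[:, None] - ratings[None, :])
print(similarities)
```

The entry similarities[i, j] is abs(ratings[i] - ratings[j]), so the matrix is symmetric with zeros on the diagonal.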

Here is the whole code again:

import numpy

ratings = numpy.arange(1, 6)

indicesX = numpy.indices((ratings.shape[0],1))[0]
indicesY = numpy.indices((ratings.shape[0],1))[0]

similarities = numpy.zeros((ratings.shape[0], ratings.shape[0]))

indicesX = indicesX.reshape(indicesX.shape[0])
indicesX = numpy.tile(indicesX, ratings.shape[0])

indicesY = numpy.repeat(indicesY, ratings.shape[0])

similarities[indicesX, indicesY] = numpy.abs(ratings[indicesX]-ratings[indicesY])
print(similarities)
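
The example above uses a one-dimensional ratings array, while the question has a two-dimensional one, where each row holds many values. A single number per pair of rows then needs some reduction over the columns. The sketch below assumes the sum of absolute differences, which the question does not specify, and the function name pairwise_abs_difference is made up for illustration.

```python
import numpy

def pairwise_abs_difference(ratings):
    """Sum of absolute element-wise differences between every pair of rows.

    Assumption: the sum over columns is the wanted reduction.
    """
    ratings = numpy.asarray(ratings, dtype=float)
    # (n, 1, m) minus (1, n, m) broadcasts to (n, n, m);
    # summing over the last axis leaves an (n, n) matrix.
    return numpy.abs(ratings[:, None, :] - ratings[None, :, :]).sum(axis=-1)

# Small made-up data standing in for the real ratings array.
small = numpy.array([[1, 2, 3],
                     [1, 2, 5],
                     [4, 2, 3]])
result = pairwise_abs_difference(small)
print(result)
```

The intermediate array has shape (n, n, m). For ratings of shape (943, 1682) that is about 1.5 billion values, roughly 12 GB as 64-bit floats, so for data of that size it may be necessary to process the rows in smaller blocks.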

PS

You commented on your own post to improve it. When you want to improve your question, you should edit it instead of commenting on it.

Thank you for your help. Your explanation is nice and easy to understand. I got it. You save me :D. Thanks for all! — diepitus, Jun 08 '21 at 20:52
I'm happy I could help you :). Maybe you can accept my answer (would be my first accepted)! — Habetuz, Jun 08 '21 at 20:54
