I have a 1D array of shape (300,) and a 2D array of shape (400, 300). I want to compute the cosine similarity between each row of the 2D array and the 1D array. The result should therefore have shape (400,), with one similarity score per row.
My initial idea is to iterate through the rows of the 2D array with a for loop and compute the cosine similarity between each row and the vector. Is there a faster alternative using broadcasting?
Here is a contrived example:
In [29]: vec = np.random.randn(300,)
In [30]: arr = np.random.randn(400, 300)
Below is the way I want to calculate the similarity between 1D arrays:
inn = (vec * arr[0]).sum()                  # dot product of vec and the first row
vecnorm = np.sqrt((vec * vec).sum())        # Euclidean norm of vec
rownorm = np.sqrt((arr[0] * arr[0]).sum())  # Euclidean norm of the first row
similarity_score = inn / vecnorm / rownorm
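For reference, the for-loop approach I mentioned could be sketched roughly as follows. This is only a baseline to compare against; the function name `cosine_similarity_loop` and the seeded random generator are made up for illustration, and it assumes no row (and not the vector itself) has zero norm.

```python
import numpy as np

def cosine_similarity_loop(vec, arr):
    """Loop baseline: cosine similarity between vec and every row of arr."""
    vecnorm = np.sqrt((vec * vec).sum())      # norm of vec, computed once
    scores = np.empty(arr.shape[0])
    for i, row in enumerate(arr):
        inn = (vec * row).sum()               # dot product with this row
        rownorm = np.sqrt((row * row).sum())  # norm of this row
        scores[i] = inn / vecnorm / rownorm
    return scores

# Hypothetical data with the same shapes as in the example above.
rng = np.random.default_rng(0)
vec = rng.standard_normal(300)
arr = rng.standard_normal((400, 300))

scores = cosine_similarity_loop(vec, arr)
print(scores.shape)
```

This gives one score per row, but it runs a Python-level loop over all 400 rows, which is what I would like to avoid.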
How can I generalize this single-row calculation so that arr[0] is replaced by the whole 2D array?