Yesterday I learnt that the cosine similarity, defined as
$$\operatorname{sim}(A,B) = \cos\theta = \frac{A \cdot B}{\|A\|_2 \, \|B\|_2},$$
can effectively measure how similar two vectors are.
I see that this definition uses the L2-norm to normalize the dot product of $A$ and $B$. What I am interested in is: why not use the L1-norm of $A$ and $B$ in the denominator instead?
My teacher told me that if I use the L1-norm in the denominator, the similarity would no longer be 1 when $A = B$. I then asked him: if I modify the cosine similarity definition as follows, what are the advantages and disadvantages of the modified version compared with the original?
$$\operatorname{sim}(A,B) = \begin{cases} \dfrac{A \cdot B}{\|A\|_1 \, \|B\|_1} & \text{if } A \neq B, \\ 1 & \text{if } A = B. \end{cases}$$
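To make the comparison concrete, here is a minimal sketch of the three variants, assuming NumPy is available. The function names `cosine_l2`, `sim_l1`, and `sim_l1_patched` and the test vectors are my own, chosen only for illustration.

```python
import numpy as np


def cosine_l2(a, b):
    """Standard cosine similarity: dot product over the product of L2-norms."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a, 2) * np.linalg.norm(b, 2)))


def sim_l1(a, b):
    """Dot product over the product of L1-norms, with no special case."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a, 1) * np.linalg.norm(b, 1)))


def sim_l1_patched(a, b):
    """Modified definition from above: 1 if A == B, otherwise the L1 version."""
    if np.array_equal(a, b):
        return 1.0
    return sim_l1(a, b)


# Illustrative vectors (hypothetical, not from any dataset).
a = np.array([1.0, 1.0])
nearly_a = np.array([1.0, 1.0 + 1e-9])  # differs from a by a tiny amount
scaled_a = 2 * a                        # same direction as a, different length

print(cosine_l2(a, a))               # 1 up to floating-point rounding
print(sim_l1(a, a))                  # 0.5
print(sim_l1_patched(a, a))          # 1.0
print(sim_l1_patched(a, nearly_a))   # approximately 0.5
print(sim_l1_patched(a, scaled_a))   # 0.5
print(cosine_l2(a, scaled_a))        # 1 up to floating-point rounding
```

For $A = (1, 1)$ the unpatched L1 version scores $A$ against itself as $\|A\|_2^2 / \|A\|_1^2 = 2/4$, and for nonzero $A$ that ratio reaches 1 only when $A$ has a single nonzero component. The patch repairs the exact case $A = B$, but in this sketch a vector that differs from $A$ by $10^{-9}$ still scores about 0.5, so the modified function seems to jump at $A = B$. It also gives 0.5 for $2A$, which points in the same direction as $A$, whereas the L2 version gives 1.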
I would appreciate it if someone could give me some more explanation.