
I am trying to compute the cosine similarity between 2 vectors (x,y Points), and I am making some silly error that I cannot nail down. Pardon me, I am a newbie, and sorry if the mistake is a very simple one (which it very likely is).

Thanks for your help

  public static double GetCosineSimilarity(List<Point> V1, List<Point> V2)
    {
        double sim = 0.0d;
        int N = 0;
        N = ((V2.Count < V1.Count)?V2.Count : V1.Count);
        double dotX = 0.0d; double dotY = 0.0d;
        double magX = 0.0d; double magY = 0.0d;
        for (int n = 0; n < N; n++)
        {
            dotX += V1[n].X * V2[n].X;
            dotY += V1[n].Y * V2[n].Y;
            magX += Math.Pow(V1[n].X, 2);
            magY += Math.Pow(V1[n].Y, 2);
        }

        return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY));
    }

Edit: Apart from syntax, my question is also about the logic, given that I am dealing with vectors of differing lengths. Also, how can the above be generalized to vectors of m dimensions? Thanks

Mikos
  • You are getting mixed up between the indices and X, Y. The index in each list should represent a component (i.e. 0->x, 1->y, 2->z). Alternatively, you would just have 2 points V1 and V2, each with an x and y, representing a 2-dimensional vector. You do not need both the index n and the .X and .Y. – JohnPS Sep 26 '11 at 22:14

2 Answers


If you are in 2 dimensions, you can represent the vectors as (V1.X, V1.Y) and (V2.X, V2.Y), then use

public static double GetCosineSimilarity(Point V1, Point V2) {
    // dot product divided by the product of the two magnitudes
    return (V1.X*V2.X + V1.Y*V2.Y)
           / ( Math.Sqrt( Math.Pow(V1.X,2)+Math.Pow(V1.Y,2))
               * Math.Sqrt( Math.Pow(V2.X,2)+Math.Pow(V2.Y,2))
             );
}
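For reference, the 2-dimensional function above follows the standard definition of cosine similarity: the dot product divided by the product of the magnitudes. Written out for V1 = (x_1, y_1) and V2 = (x_2, y_2), with a small worked example whose input values are chosen purely for illustration:

```latex
\cos\theta
  = \frac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert}
  = \frac{x_1 x_2 + y_1 y_2}{\sqrt{x_1^2 + y_1^2}\,\sqrt{x_2^2 + y_2^2}}

% Example: V_1 = (1, 0), V_2 = (1, 1)
\cos\theta
  = \frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{1^2 + 0^2}\,\sqrt{1^2 + 1^2}}
  = \frac{1}{\sqrt{2}} \approx 0.7071
```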

In higher dimensions, you can represent each vector as a List<double>. So, in 4 dimensions, the first vector would have components V1 = (V1[0], V1[1], V1[2], V1[3]).

public static double GetCosineSimilarity(List<double> V1, List<double> V2)
{
    // Only the components both vectors have in common are compared.
    int N = Math.Min(V1.Count, V2.Count);
    double dot = 0.0d;
    double mag1 = 0.0d;
    double mag2 = 0.0d;
    for (int n = 0; n < N; n++)
    {
        dot += V1[n] * V2[n];           // running dot product
        mag1 += Math.Pow(V1[n], 2);     // squared magnitude of V1
        mag2 += Math.Pow(V2[n], 2);     // squared magnitude of V2
    }

    return dot / (Math.Sqrt(mag1) * Math.Sqrt(mag2));
}
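On the question of vectors with differing lengths: the function above truncates both vectors to the shorter length, so the extra components of the longer vector are ignored in the dot product and in its magnitude. One possible alternative, sketched below, is to treat the missing components of the shorter vector as zeros. Whether that is appropriate depends on what the components mean in your data. The names `VectorMath` and `CosineSimilarityPadded` are made up for this illustration, and returning 0 for a zero-magnitude vector is an assumption, since the similarity is undefined in that case.

```csharp
using System;
using System.Collections.Generic;

public static class VectorMath
{
    // Hypothetical helper (not from the answer above): cosine similarity that
    // zero-pads the shorter vector instead of truncating the longer one.
    public static double CosineSimilarityPadded(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a == null) throw new ArgumentNullException(nameof(a));
        if (b == null) throw new ArgumentNullException(nameof(b));

        int n = Math.Max(a.Count, b.Count);
        double dot = 0.0, magA = 0.0, magB = 0.0;

        for (int i = 0; i < n; i++)
        {
            // Assumption: components past the end of a vector count as 0.
            double x = i < a.Count ? a[i] : 0.0;
            double y = i < b.Count ? b[i] : 0.0;
            dot += x * y;
            magA += x * x;
            magB += y * y;
        }

        // Assumption: report 0 when either vector has zero magnitude,
        // where the cosine is mathematically undefined.
        if (magA == 0.0 || magB == 0.0) return 0.0;

        return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
    }
}
```

For example, with V1 = (1, 0) and V2 = (1, 0, 1), the truncating version compares only the first two components and returns 1, while this padded version includes the third component of V2 in its magnitude and returns 1/√2, roughly 0.7071.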
Tim
JohnPS

The last line should be

return (dotX + dotY)/(Math.Sqrt(magX) * Math.Sqrt(magY));
HasaniH