I have a question about two t-SNE plots I made. I have a set of 850 articles, and I wanted to check which articles are similar to each other. I pre-processed the articles, built a tf-idf matrix for the whole set, and made two t-SNE plots from it: one using cosine distances and one using Euclidean distances.

However, the two plots look very similar; it looks a bit as if only the axes are swapped. Is there a logical explanation for this?

The colors are the labels an article got from a simple sentiment analysis.

[Figure: t-SNE plot using cosine distances]

[Figure: t-SNE plot using Euclidean distances]

Thanks for any help in advance!

HenkieTee

1 Answer

Your result indicates that Euclidean distance and cosine distance are likely the same distance function (up to a scaling or other monotone transformation) for this specific type of data. You can verify this by plotting heatmaps of the two distance matrices and comparing them.
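One way this can happen: if the tf-idf rows are normalized to unit L2 length (an assumption here; it is the default in some implementations, for example scikit-learn's `TfidfVectorizer`), then for any two rows the squared Euclidean distance equals twice the cosine distance, so both metrics rank neighbours identically. The sketch below checks that relation on random toy vectors that only stand in for tf-idf rows; the helper names are made up for illustration, and it uses only the standard library.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(rows, dist):
    """Full pairwise distance matrix as a list of lists."""
    return [[dist(a, b) for b in rows] for a in rows]


random.seed(0)
n_docs, n_terms = 20, 50

# Toy stand-in for tf-idf rows: sparse, non-negative, then L2-normalized.
rows = []
for i in range(n_docs):
    v = [random.random() if random.random() < 0.3 else 0.0
         for _ in range(n_terms)]
    v[i % n_terms] += 1.0  # guarantee a non-zero vector
    rows.append(l2_normalize(v))

D_euc = distance_matrix(rows, euclidean)
D_cos = distance_matrix(rows, cosine_distance)

# For unit-length rows: euclidean**2 == 2 * cosine_distance (up to rounding).
max_gap = max(
    abs(D_euc[i][j] ** 2 - 2.0 * D_cos[i][j])
    for i in range(n_docs)
    for j in range(n_docs)
)
print("largest deviation from d_euc^2 = 2 * d_cos:", max_gap)
```

With your real data you would compute `D_euc` and `D_cos` from the actual tf-idf matrix and draw each as a heatmap. If the deviation is near zero, t-SNE sees essentially the same neighbourhood structure in both runs, and the remaining differences between the plots (rotation, flipped axes) come from t-SNE's random initialization rather than from the metric.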

James LI