If I use a tf-idf feature representation (or just document-length normalization), are Euclidean distance and (1 - cosine similarity) basically the same? All the textbooks I have read, and other forums and discussions, say cosine similarity works better for text...
I wrote some basic code to test this and found that they are indeed comparable: the floating-point values are not exactly the same, but one looks like a scaled version of the other. Below are the results of both measures on simple demo text data. Text no. 2 is a long line of about 50 words; the rest are short 10-word lines.
Cosine similarity: 0.0, 0.2967, 0.203, 0.2058
Euclidean distance: 0.0, 0.285, 0.2407, 0.2421
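For reference, here is a minimal sketch of this kind of comparison. It is not the exact code that produced the numbers above: the vectors in `docs` are made-up tf-idf-like weights, and the helper names are only illustrative. It assumes every vector is L2-normalized (scaled to unit length) before comparing, in which case the squared Euclidean distance should equal 2 * (1 - cosine similarity). With a different normalization (for example, dividing by document length) that exact relation need not hold.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Return the ordinary (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical tf-idf-like weights, one row per document (not real data).
docs = [
    [0.5, 1.2, 0.0, 0.3],
    [2.0, 0.1, 1.5, 0.0],
    [0.4, 1.0, 0.2, 0.5],
]

# Assumption: every document vector is L2-normalized before comparison.
unit_docs = [l2_normalize(d) for d in docs]
query = unit_docs[0]

# Collect (cosine distance, Euclidean distance) of each document to the first.
results = []
for d in unit_docs:
    cos_d = cosine_distance(query, d)
    euc_d = euclidean_distance(query, d)
    results.append((cos_d, euc_d))
    # On unit vectors, euc_d ** 2 and 2 * cos_d should agree up to rounding.
    print(f"cosine distance={cos_d:.4f}  euclidean={euc_d:.4f}  "
          f"euclidean^2={euc_d ** 2:.4f}  2*cosine distance={2 * cos_d:.4f}")
```

Under that unit-length assumption the two measures rank documents in the same order, since one is a monotonic function of the other.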
Note: If this question is more suitable for Cross Validated or Data Science, please let me know.