I'm trying to determine the similarity between two documents using Carrot2. Is it possible to get this similarity directly from the framework?
Additionally, I've been studying the tf-idf matrix and realized that the rows correspond to the stemmed words and the columns to documents. However, how can I identify which document corresponds to which column?
For example, given a list of documents, will the column order be the same as the order of the documents in the list?
Ex:
List docs = {doc1, doc2, doc3}
and
Column 0 = doc1, Column 1 = doc2
...
Is this correct?
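
To make the first question concrete, what I am after is something like the cosine similarity between two columns of the tf-idf matrix. Below is a minimal sketch in plain Java. It assumes the matrix is available as a double[][] with rows as terms and columns as documents. The class ColumnCosine and the method cosine are hypothetical names of my own, not part of the framework's API, and the sample values are made up for illustration.

```java
public class ColumnCosine {

    // Cosine similarity between columns a and b of a term-document matrix
    // (assumption: rows = terms, columns = documents).
    public static double cosine(double[][] tfidf, int a, int b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (double[] row : tfidf) {
            dot += row[a] * row[b];
            normA += row[a] * row[a];
            normB += row[b] * row[b];
        }
        // An all-zero column has no direction; treat the similarity as 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up weights: 3 terms (rows) x 3 documents (columns).
        double[][] tfidf = {
            {1.0, 1.0, 0.0},
            {0.0, 1.0, 0.0},
            {0.0, 0.0, 2.0},
        };
        System.out.println(cosine(tfidf, 0, 1)); // doc1 vs doc2, share one term
        System.out.println(cosine(tfidf, 0, 2)); // doc1 vs doc3, no shared terms
    }
}
```

This only gives a meaningful result if I know which column belongs to which document, which is why the column order matters to me.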