I want to rank 100 documents based on similarity. For example, one set of 10 documents might be similar to each other, say (A, A', A'', A''', ...), and another set of 10 documents might be similar to each other, say (B, B', B'', B''', ...). The documents should then be ranked as A, A', A'', A''', ..., B, B', B'', B''', ... and so on.
The similarity metric is based on word usage. The use case, after ranking, is to arrange the documents for reading so that similar documents are read together, like A, A', A'', ..., B, B', B'', ..., Z, Z', Z''.
Can I use TF-IDF to achieve this ranking? Is there any C library for doing this?
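
To make the question concrete, here is a rough sketch of what I have in mind, written in Python only for brevity (the arithmetic should port to C directly). It builds a TF-IDF vector per document, compares documents by cosine similarity, and then orders them greedily so that each document is followed by the most similar one not yet placed. The function names (`tfidf_vectors`, `cosine`, `reading_order`), the whitespace tokenizer, the smoothed IDF formula, and the greedy ordering itself are all my own assumptions for illustration, not an established method; clustering the vectors first might well be a better approach.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: word -> weight) per document."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (assumption): log((1 + n) / (1 + df)) + 1 stays positive.
        vectors.append({
            word: (count / len(tokens)) * (math.log((1 + n) / (1 + df[word])) + 1)
            for word, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def reading_order(docs):
    """Greedy ordering: start at document 0, then repeatedly append the
    unplaced document most similar to the one placed last."""
    vectors = tfidf_vectors(docs)
    if not vectors:
        return []
    order = [0]
    remaining = set(range(1, len(docs)))
    while remaining:
        last = vectors[order[-1]]
        # sorted() makes ties deterministic: the lowest index wins.
        nxt = max(sorted(remaining), key=lambda i: cosine(last, vectors[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    sample = [
        "apple banana fruit salad",
        "engine wheel car road",
        "banana apple fruit juice",
        "car engine road trip",
    ]
    # Documents 0 and 2 share fruit words; documents 1 and 3 share car words.
    print(reading_order(sample))
```

The greedy pass needs all pairwise similarities in the worst case, which is cheap for 100 documents. What I am unsure about is whether this kind of ordering is a sensible use of TF-IDF, or whether I should cluster the documents and then read them cluster by cluster.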