
I am modifying the source code of Carrot2 for a project. As I understand it, the Lingo algorithm first generates the most probable labels and then builds the clusters that best fit those labels, right? If so, can I supply my own set of labels to Carrot2 to see how it clusters the documents around them?

mkj
sir_osthara

1 Answer


Unfortunately, you can't provide your own labels for clustering with Lingo.

On the other hand, the label-to-document assignment algorithm is very simple in Lingo -- if the document contains the label's words, it will be assigned to the label. Therefore, you can achieve the same effect by, for example, indexing your documents in Lucene and then querying the index using the predefined labels you have.
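To illustrate the assignment rule described above, here is a minimal Python sketch. It is not Carrot2's or Lucene's actual code: the `tokenize` and `assign_to_labels` helpers are hypothetical, the sample documents and labels are made up, and it assumes "contains the label's words" means that every word of the label occurs somewhere in the document after simple lowercase tokenization.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def assign_to_labels(documents, labels):
    """Map each label to the indices of documents containing all of its words.

    Assumption: a document matches a label when every label word appears
    in the document; word order and proximity are ignored.
    """
    doc_tokens = [tokenize(doc) for doc in documents]
    clusters = {}
    for label in labels:
        label_words = tokenize(label)
        clusters[label] = [
            i for i, tokens in enumerate(doc_tokens)
            if label_words and label_words <= tokens  # subset test
        ]
    return clusters

# Hypothetical sample data, for illustration only.
documents = [
    "Apache Lucene is a full-text search library",
    "Carrot2 clusters search results",
    "Singular value decomposition factorizes a matrix",
]
labels = ["search results", "Lucene", "matrix decomposition"]

clusters = assign_to_labels(documents, labels)
for label, doc_ids in clusters.items():
    print(label, "->", doc_ids)
```

With a real index you would get the same kind of grouping by running one query per predefined label; a document can land under several labels or under none, so you may want a catch-all group for unmatched documents.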

Stanislaw Osinski
  • Based on the answer, I presume the most important and unique aspect of Lingo is its label creation? – sir_osthara Nov 18 '14 at 09:49
  • Correct. Take a look at [Carrot2 publications](http://project.carrot2.org/publications.html) for some papers, such as [Lingo: Search Results Clustering Algorithm Based on Singular Value Decomposition](http://www.cs.put.poznan.pl/dweiss/site/publications/download/iipwm-osinski-weiss-stefanowski-2004-lingo.pdf) or [A Concept-Driven Algorithm for Clustering Search Results](http://doi.ieeecomputersociety.org/10.1109/MIS.2005.38). – Stanislaw Osinski Nov 18 '14 at 21:41