
I am using the Python Elasticsearch client and need to cluster documents. I have installed the Carrot2 plugin (https://github.com/carrot2/elasticsearch-carrot2).

How do I call the Carrot2 clustering module from Python once I have a client set up like this:

from elasticsearch import Elasticsearch
es = Elasticsearch()
es.search(.....)
Pratik Poddar

1 Answer


The Carrot2 plugin for Elasticsearch gives you access to the clustered documents at

http://localhost:9200/_plugin/carrot2/ (or wherever your ES node is deployed)

See the usage guide on the GitHub project page.

pyelasticsearch has no support for calling the Carrot2 clustering endpoint. You can instead apply Carrot2 on top of the search results returned by pyelasticsearch; see the Carrot2 project if you need to do that. Otherwise, take a look at some Python text clustering tools here.
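One way to reach the plugin from Python without client support is to post to its REST endpoint directly. The sketch below uses only the standard library. It assumes the endpoint name `_search_with_clusters` and the body keys `search_request`, `query_hint` and `field_mapping` as described in the plugin's usage guide; check these against the plugin version you have installed. The function names and the `_source.` field specifications mentioned afterwards are illustrative, not part of any library.

```python
import json
import urllib.request


def build_cluster_request(search_request, query_hint, title_fields, content_fields):
    """Build the JSON body for the plugin's clustering endpoint (assumed layout)."""
    return {
        # An ordinary Elasticsearch search body; its hits are what gets clustered.
        "search_request": search_request,
        # The user's query text, passed to the clustering algorithm as a hint.
        "query_hint": query_hint,
        # Which document fields the plugin should treat as title and content.
        "field_mapping": {
            "title": list(title_fields),
            "content": list(content_fields),
        },
    }


def search_with_clusters(base_url, index, body, timeout=30):
    """POST the body to <base_url>/<index>/_search_with_clusters and decode the JSON reply."""
    url = "{}/{}/_search_with_clusters".format(base_url.rstrip("/"), index)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `search_with_clusters("http://localhost:9200", "my_index", build_cluster_request({"query": {"match_all": {}}}, "data mining", ["_source.title"], ["_source.content"]))`, where `my_index` and the field names are placeholders for your own index and mapping.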

grasskode
  • How do I apply carrot2 on top of search results from pyelasticsearch? – Pratik Poddar Apr 22 '14 at 16:36
  • You will have to build carrot2 clustering as a service and call it from your python project. Any particular reason why you cannot use existing python clustering tools? – grasskode May 08 '14 at 10:45
  • Existing Python clustering tools do not take the wider document collection into account. I want to cluster 10 documents into n clusters using information from 1,000,000 documents, so I guess Carrot2 is the way to go. – Pratik Poddar May 08 '14 at 14:19
  • Can you please check this question: http://stackoverflow.com/questions/23540328/carrot2elasticsearch-basic-flow-of-information – Pratik Poddar May 08 '14 at 14:19
  • Yes, you can move away from pyelasticsearch if you want to use the plugin. REST is the way to go. Will reply on the thread. – grasskode May 09 '14 at 12:21
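The requirement raised in the comments, clustering a handful of documents while using statistics from a much larger collection, can be illustrated in plain Python. This is a minimal sketch, not what Carrot2 does internally. It computes inverse document frequencies over a background corpus, then uses them to weight the few documents of interest and compare them by cosine similarity. All function names are hypothetical, and the resulting vectors would still need to be passed to a clustering routine of your choice.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def background_idf(corpus):
    """Smoothed inverse document frequency computed over the large collection."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for text in corpus:
        doc_freq.update(set(tokenize(text)))
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}
    # Weight used for terms that never occur in the background corpus.
    default = math.log(1 + n_docs) + 1.0
    return idf, default


def tfidf_vector(text, idf, default):
    """Sparse TF-IDF vector for one document, using the background weights."""
    counts = Counter(tokenize(text))
    return {term: tf * idf.get(term, default) for term, tf in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

Because the weights come from the background corpus, a term that is common across the large collection contributes little to the similarity between two of the small set's documents, even if it is rare within that small set.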