
Currently we're using Couchbase and Elasticsearch (2.x), and we replicate data from CB to ES successfully using the elasticsearch-transport-couchbase plugin.

The problems began while upgrading to ES 5.6.4. Until now we used a single index in ES. Because Elasticsearch no longer recommends this approach, we are now trying to create multiple indices in ES (one index per type). That means we need a way to replicate data from CB (a single bucket) to ES (multiple indices).

What is the best way to approach this? Possible solutions:

  1. Continue using the elasticsearch-transport-couchbase plugin, but then we'll have to create a lot (~150) of XDCR replications, one replication per type. I doubt this will scale.
  2. Write our own solution using Spark or Kafka. Neither of them is in our technology stack, so implementation might take time, which makes this the less favourable option.

Any help would be appreciated.

shays10
  • How are you going to decide which data goes to which index? – Ankur Jyoti Phukan Dec 31 '17 at 09:21
  • Our key for documents in CB is type::uuid, we can check via the '::' delimiter easily. – shays10 Dec 31 '17 at 09:27
  • Can you provide any sample data, and error logs from the indexing process if you have them? – Ankur Jyoti Phukan Dec 31 '17 at 09:41
  • It can be a simple JSON doc with key "user::123", where user is the entity type (and the index that this doc should be indexed to) and 123 is the key. No errors to post since I don't have a solution yet. Currently trying to continue using the elasticsearch-transport-plugin and redirect the documents to the correct index via the ES ingest API. Will update if it works. – shays10 Jan 03 '18 at 11:59
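
The routing rule described in the comments above (the part of the key before "::" names the target index) can be sketched in a few lines of Python. This is only an illustration of the rule, not part of either plugin or connector; the function name `index_for_key` and the lowercasing step are assumptions made for the example (Elasticsearch requires index names to be lowercase).

```python
def index_for_key(key, delimiter="::"):
    """Derive a target index name from a document key such as "user::123".

    Hypothetical helper for illustration only: the text before the first
    delimiter is treated as the entity type and used as the index name.
    """
    prefix, sep, rest = key.partition(delimiter)
    # partition() returns an empty separator when the delimiter is absent.
    if not sep or not prefix or not rest:
        raise ValueError("key %r is not of the form type%suuid" % (key, delimiter))
    # Elasticsearch index names must be lowercase.
    return prefix.lower()


if __name__ == "__main__":
    for doc_key in ("user::123", "Order::9f1c"):
        print(doc_key, "->", index_for_key(doc_key))
```

Whatever sits between Couchbase and Elasticsearch (an ingest pipeline, a custom consumer, or a connector feature) would need to apply an equivalent rule per document.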

1 Answer


Version 4 of the Couchbase Elasticsearch Connector supports the new "index-per-type" model (and other features, including support for ES 6, secure connections, and replication checkpoint management tools). If you'd like to try it out, your feedback would be invaluable.

Disclaimer: I am a Couchbase employee developing the Elasticsearch connector.

dnault
  • Thanks, looking into it! That would be very helpful :) Can we use it to route a document with id "user::123" to the "user" index? Or, to rephrase the question, can we use it to define a delimiter (in this case, a double colon) so that the prefix before the delimiter is automatically used as the index in ES? From what I saw (after quickly browsing the TOML properties file), we can explicitly define behaviours for specific types (the airline_ prefix to the airlines index), but it can't dynamically infer the index name. Am I correct? – shays10 Sep 26 '18 at 13:43
  • @shays10 You are correct. That sounds like a great idea though! Would you like to add this feature request to the project's [issue tracker](https://github.com/couchbase/couchbase-elasticsearch-connector/issues)? – dnault Sep 26 '18 at 16:30