How can I export an Elasticsearch index to a file and then insert that data into another cluster? I want to move data from one cluster to another, but I can't connect them directly.
-
Is the Elasticsearch version the same on both clusters? – javanna Jul 27 '13 at 09:43
-
Cool, how about the layout: how many nodes? Same number of nodes on both clusters? – javanna Jul 30 '13 at 11:44
-
On the current cluster I have 7 data nodes and one non-data node used only for dispatching, with 3 shards and 1 replica. I want to move all the data to another cluster with 4 nodes and 1 dispatcher, with 4 shards and 1 replica. – voyagersm Jul 30 '13 at 12:22
-
OK, I was asking to understand whether you need to reindex the data or can use the existing index. If you want a different number of primary shards, you definitely need to reindex, unfortunately. Did you index the data from a database, or is Elasticsearch your data storage too? – javanna Jul 30 '13 at 13:23
-
I use elasticsearch for data storage too. – voyagersm Jul 31 '13 at 13:18
-
I'm still searching for a solution. – voyagersm Aug 08 '13 at 15:20
-
Elasticsearch 1.0 has a snapshot API that you can use. Alternatively, see [this python script](http://stackoverflow.com/a/24911018/509706). – Wilfred Hughes Jul 23 '14 at 13:04
3 Answers
If you don't need to keep the _id the same and only the _source matters, you can use Logstash with a config like:
input {
  elasticsearch {
    # connection settings for the source cluster
  }
}
output {
  elasticsearch {
    # connection settings for the destination cluster
  }
}
More info here: http://www.logstash.net/docs/1.4.2/
Yes, this method is odd, but I tried it for transferring data between clusters on the fly, index by index, and it works like a charm (provided, of course, that you don't need to keep the _id generated by Elasticsearch).

There is a script that will help you back up and restore indices from one cluster to another. I haven't tested it, but it may fit your needs. See: Backup and restore an Elastic search index
You can also use a Perl script to copy an index from one cluster to another (or within the same cluster).
See: clintongormley/ElasticSearch.pm
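For the file-based route the question asks about, a copy script of this kind usually comes down to reading hits from the source index and writing them out in the bulk API's newline-delimited format. Below is a minimal Python sketch of that conversion, not the linked scripts' actual code. The function names `hits_to_bulk_lines` and `write_bulk_file` and the `meta_fields` parameter are hypothetical, and the hits are assumed to be dicts shaped like search results (`_id`, `_source`).

```python
import json


def hits_to_bulk_lines(hits, index, meta_fields=("_id",)):
    """Convert search hits into bulk-API lines.

    Each document becomes two lines: an action line ({"index": {...}})
    followed by the document's _source. `meta_fields` names the hit
    fields copied into the action metadata (assumption: add "_type"
    here on old clusters that still require mapping types).
    """
    lines = []
    for hit in hits:
        meta = {"_index": index}
        for field in meta_fields:
            if field in hit:
                meta[field] = hit[field]
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(hit["_source"]))
    return lines


def write_bulk_file(hits, index, path):
    """Write the bulk lines to a file; the bulk body must end with a newline."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in hits_to_bulk_lines(hits, index):
            handle.write(line + "\n")
```

The resulting file can be carried to the destination and sent to its `_bulk` endpoint in batches. Because the target index is created separately, it can use a different number of primary shards than the source.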

-
-
@colidyre Check this link, but it's old and I think they strongly recommend using the official Python API: https://github.com/eriky/ESClient. Also check my forked version, which has a copy script: https://github.com/kirubar/ESClient – kiruba Mar 03 '16 at 14:05
I recently tried my hand at this, and there are a couple of approaches that can help you.
Use Elasticsearch's Snapshot and Restore APIs. You can take a snapshot at the source cluster and use that snapshot to restore data to your destination cluster.
If your setup allows installing external packages, you can use Elasticdump as well.
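To make the snapshot-and-restore approach a bit more concrete, here is a hedged Python sketch that only builds the REST requests involved, without sending them. The helper names, repository name, and paths are hypothetical. It assumes a shared-filesystem (`fs`) repository whose directory is copied by hand to the destination, since the clusters cannot reach each other, and that the location is allowed by each node's `path.repo` setting.

```python
def register_repo(repo, location):
    """Request that registers a filesystem snapshot repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return ("PUT", f"/_snapshot/{repo}", body)


def snapshot_requests(repo, snapshot, indices, location):
    """Requests to run against the SOURCE cluster."""
    return [
        register_repo(repo, location),
        # wait_for_completion makes the call block until the snapshot is done
        ("PUT",
         f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
         {"indices": ",".join(indices)}),
    ]


def restore_requests(repo, snapshot, indices, location):
    """Requests to run against the DESTINATION cluster, after the
    repository directory has been copied to `location` there."""
    return [
        register_repo(repo, location),
        ("POST",
         f"/_snapshot/{repo}/{snapshot}/_restore",
         {"indices": ",".join(indices)}),
    ]
```

Note that a restored index keeps the number of primary shards it had when the snapshot was taken, so this route alone does not change the shard count. Going from 3 to 4 primary shards still requires reindexing.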
HTH!
