I am trying to migrate (copy) 35 million documents (a fairly ordinary volume, not too big) from Couchbase to Elasticsearch.
My Elasticsearch (version 1.3) cluster is composed of 3 A3 (4 cores, 7 GB memory) CentOS servers on Microsoft Azure (each server is roughly equivalent to a large instance on Amazon).
I use time-based ("timing data flow") indexing to store the documents. Each index represents one month and is composed of 3 shards and 2 replicas.
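To make that layout concrete, here is a quick sketch of the arithmetic. It assumes that "2 replicas" means two replica copies of each primary shard (i.e. number_of_replicas = 2); the function name is just something I made up for illustration.

```python
def shard_layout(nodes, primaries, replicas_per_primary):
    """Count shard copies for one monthly index.

    Assumption: replicas_per_primary is the number of replica copies
    kept for each primary shard (number_of_replicas).
    """
    copies_per_shard = 1 + replicas_per_primary      # the primary plus its replicas
    total_copies = primaries * copies_per_shard      # shard copies in the whole index
    return {
        "copies_per_shard": copies_per_shard,
        "total_copies": total_copies,
        "copies_per_node": total_copies / nodes,     # if spread evenly across nodes
    }


layout = shard_layout(nodes=3, primaries=3, replicas_per_primary=2)
print(layout["total_copies"])      # 9 shard copies per monthly index
print(layout["copies_per_node"])   # 3.0 shard copies on each node
print(layout["copies_per_shard"])  # 3 writes for every document
```

If I understand the allocation correctly, this means each of the 3 nodes ends up holding a copy of every shard, so every document gets indexed on all three machines.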
When I start the migration script, I see that insertion becomes very slow (about 10 documents per second) and the load average of each server in the cluster climbs above 1.5. In addition, JVM heap usage rises to almost 100%, while CPU shows 20% and IOPS peaks at about 20. (I used Marvel CNC to get all of this data.)
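To put the rate in perspective, here is a rough back-of-the-envelope calculation. It assumes the rate stays constant for the whole run, which is a simplification; the helper name is hypothetical.

```python
TOTAL_DOCS = 35_000_000
SECONDS_PER_DAY = 86_400


def days_to_migrate(total_docs, docs_per_second):
    """Days needed to copy total_docs at a constant indexing rate."""
    return total_docs / docs_per_second / SECONDS_PER_DAY


print(round(days_to_migrate(TOTAL_DOCS, 10), 1))    # 40.5 days at the rate I see now
print(round(days_to_migrate(TOTAL_DOCS, 100), 1))   # 4.1 days at 100 documents per second
```

So at the current speed the migration would take well over a month, which is why I am asking about 100 documents per second as a minimum.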
- Has anyone faced this kind of indexing problem in Elasticsearch?
- Are there any parameters I should be aware of for increasing the Java (JVM heap) memory?
- Are my cluster specifications good enough to handle indexing 100 documents per second?
- Does indexing time depend on how big the index is? And should it be this slow?
Thanks, Niv