Previously I used a JDBC river to index all my data from MySQL into Elasticsearch. I have now switched to the Tire bulk API, as it gives me the freedom to manipulate the data before indexing it. However, for 3M records, indexing with the Tire bulk API takes much longer (4 times as long) than the JDBC river. Is there a way to make the indexing process quicker and more efficient?
1 Answer
IMHO, the key is that the JDBC river runs inside Elasticsearch. So after a JDBC request, the data is already in memory and is sent directly to ES.
With an external process, you have one more network hop.
That said, 4 times slower is perhaps too much.

dadoonet
Not necessarily "too much": it depends on which HTTP client is used (keep-alive), whether it goes over the network vs. the Java API, etc. – karmi Nov 10 '12 at 08:57
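
To illustrate the keep-alive point, here is a minimal sketch in plain Ruby that sends documents to the Elasticsearch `_bulk` endpoint in batches over one persistent connection. It does not use Tire. The helper names `bulk_body` and `bulk_index`, the host and port, the batch size of 1000, and the `articles`/`article` names in the usage note are all assumptions for illustration. Depending on the Elasticsearch version, the `_type` field or the `Content-Type` header may need adjusting.

```ruby
require 'json'
require 'net/http'

# Build one bulk request body (newline-delimited JSON):
# an action line followed by a source line for each document.
# Assumes each document is a Hash with an :id key.
def bulk_body(index, type, docs)
  docs.flat_map { |doc|
    action = { index: { _index: index, _type: type, _id: doc[:id] } }
    [JSON.generate(action), JSON.generate(doc)]
  }.join("\n") + "\n"
end

# Send all records in fixed-size batches over a single connection.
# Net::HTTP.start keeps the TCP connection open for the whole block,
# so each batch avoids the cost of a new connection.
def bulk_index(records, index:, type:, host: 'localhost', port: 9200, batch_size: 1000)
  Net::HTTP.start(host, port) do |http|
    records.each_slice(batch_size) do |batch|
      response = http.post('/_bulk', bulk_body(index, type, batch),
                           'Content-Type' => 'application/x-ndjson')
      raise "bulk request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
    end
  end
end
```

A call such as `bulk_index(rows, index: 'articles', type: 'article')` would then issue one HTTP request per 1000 rows. Whether this closes the gap with the river depends on the setup, so the batch size is worth measuring rather than assuming.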