
I'm sorry that this is probably a rather broad question, but I haven't found a solution for this problem yet.

I'm trying to run an Elasticsearch cluster on Mesos through Marathon with Docker containers. To that end, I built a Docker image that can be started on Marathon and dynamically scaled via either the frontend or the API.
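
For context, a minimal sketch of how such an app can be scaled through Marathon's REST API (the Marathon address and the app id below are placeholders, and Python with requests is just one way to issue the call):

    import requests

    MARATHON = "http://marathon.example.com:8080"  # placeholder Marathon address
    APP_ID = "elasticsearch"                       # placeholder Marathon app id

    def scale(instances):
        """Ask Marathon to run the given number of instances of the app.

        Marathon answers with a deployment id that can be used to track
        the scaling operation.
        """
        resp = requests.put(f"{MARATHON}/v2/apps/{APP_ID}",
                            json={"instances": instances})
        resp.raise_for_status()
        return resp.json()

    scale(4)  # e.g. scale the cluster to 4 Elasticsearch nodes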

This works great for test setups, but the question remains how to persist the data so that if the cluster is scaled down (I know this also touches the index configuration itself) or stopped, I can restart it later (or scale it up) with the same data.

The thing is that Marathon decides where (on which Mesos slave) the nodes run, so if I persist the data to the Docker hosts via Docker volumes, it's not predictable whether all the data will be available to the "new" nodes upon restart.

The only approaches that come to mind are:

  • Using a distributed file system like HDFS or NFS, with volumes mounted either on the Docker hosts or in the Docker containers themselves. Still, that leaves the question of how to load all the data when the new cluster starts up if the "old" cluster had, for example, 8 nodes and the new one only has 4.

  • Using the Snapshot API of Elasticsearch to save to a common drive somewhere in the network (a sketch follows below). I assume that this will have performance penalties...
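
To make the second option concrete, here's a minimal sketch of what the snapshot side could look like. The repository name, the mount path, and the node address are made up for illustration; the shared location would have to be mounted on every node and whitelisted via path.repo in elasticsearch.yml:

    import requests

    ES = "http://localhost:9200"  # any node of the cluster (placeholder)

    # Register a shared-filesystem snapshot repository. "/mnt/es_backups"
    # is a made-up path that must be mounted on every node and listed
    # under path.repo in elasticsearch.yml.
    requests.put(ES + "/_snapshot/my_backup", json={
        "type": "fs",
        "settings": {"location": "/mnt/es_backups"},
    }).raise_for_status()

    # Take a snapshot of all indices and wait until it finishes.
    # Subsequent snapshots of the same indices are incremental.
    requests.put(
        ES + "/_snapshot/my_backup/snapshot_1",
        params={"wait_for_completion": "true"},
    ).raise_for_status()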

Are there any other ways to approach this? Are there any recommendations? Unfortunately, I haven't found a good resource on this topic. Thanks a lot in advance.

Tobi
  • Elasticsearch and NFS are not the best of pals ;-). You don't want to run your cluster on NFS; it's much too slow, and Elasticsearch works better the faster your storage is. If you introduce the network into this equation, you'll get into trouble. I have no idea about Docker or Mesos, but I definitely recommend against NFS. Use snapshot/restore. – Andrei Stefan Jun 12 '15 at 07:55
  • @AndreiStefan Thanks a lot for the insight on NFS. Is the Snapshot API really the way to go if we have 100 GB of data? – Tobi Jun 12 '15 at 08:11
  • The subsequent snapshots are incremental. So, the first snapshot will take some time, but the rest of the snapshots should take less space and less time. 100 GB of data in total, or 100 GB on primaries only (no replicas)? Also, note that "incremental" means incremental at the file level, not the document level. – Andrei Stefan Jun 12 '15 at 08:24
  • We will potentially have a three-digit number of GBs on our primaries... We are still testing different index designs and mappings, so there is no final one yet. We just want to be prepared for production use. – Tobi Jun 12 '15 at 08:28
  • I see. It will take some time, yes. But, as I said, the first snapshot will be the heaviest; the subsequent ones will not be like that. – Andrei Stefan Jun 12 '15 at 08:31
  • If I understand correctly, the snapshot will be created from one node, correct? And the restore will be triggered via one node as well, and then distributed to the other nodes in the cluster? – Tobi Jun 12 '15 at 08:38
  • The snapshot itself needs all the nodes that have the primaries of the indices you want snapshotted. And those nodes all need access to the common location (the repository) so that they can write to it. – Andrei Stefan Jun 12 '15 at 08:54
  • Thanks! I was referring to https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html#_snapshot So the node the request was sent to "propagates" it to the other nodes holding primaries, if I understood correctly. If you'd write a short answer, I'd be happy to accept it. – Tobi Jun 12 '15 at 08:57
  • Yes, it's just like with any other command. The node receives it, and since it knows the global cluster state, it has information about every other node and shard. – Andrei Stefan Jun 12 '15 at 09:00

2 Answers


Elasticsearch and NFS are not the best of pals ;-). You don't want to run your cluster on NFS; it's much too slow, and Elasticsearch works better the faster your storage is. If you introduce the network into this equation, you'll get into trouble. I have no idea about Docker or Mesos, but I definitely recommend against NFS. Use snapshot/restore.

Snapshots are incremental. The first snapshot will take some time, but subsequent snapshots should take less space and less time. Also, note that "incremental" means incremental at the file level, not the document level.

The snapshot itself involves all the nodes that hold the primaries of the indices you want snapshotted. And those nodes all need access to the common location (the repository) so that they can write to it. This common access to the same location is usually not that obvious; that's why I'm mentioning it.
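
For illustration, a minimal sketch of the restore side, assuming a shared-filesystem repository (the repository name, path, and node address are placeholders). Elasticsearch redistributes the restored shards across however many nodes the new cluster has, so a 4-node cluster can restore a snapshot taken on 8 nodes:

    import requests

    ES = "http://localhost:9200"  # any node of the new, possibly smaller, cluster

    # Register the same repository on the new cluster, pointing at the
    # location the snapshots were written to (placeholder name and path).
    requests.put(ES + "/_snapshot/my_backup", json={
        "type": "fs",
        "settings": {"location": "/mnt/es_backups"},
    }).raise_for_status()

    # Restore the snapshot; the cluster spreads the shards over the
    # nodes that are currently available.
    requests.post(
        ES + "/_snapshot/my_backup/snapshot_1/_restore"
    ).raise_for_status()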

Andrei Stefan

The best way to run Elasticsearch on Mesos is to use a specialized Mesos framework. The first effort in this area is https://github.com/mesosphere/elasticsearch-mesos. There is a more recent project which is, AFAIK, currently under development: https://github.com/mesos/elasticsearch. I don't know what its status is, but you may want to give it a try.

rukletsov
  • Thanks for your answer. The first project you linked is more or less dead, while the second one is in active development IMHO... I don't see exactly where the latter is "better" than running Docker images on Marathon, but maybe you can give some details on that. I think it's much more flexible (and lightweight) to run on Marathon as far as scaling etc. is concerned, but that's just my personal view. – Tobi Jun 12 '15 at 10:28
  • Having a specialized framework gives you a lot of flexibility around the lifetime of your tasks (= Elasticsearch nodes) and allows you to manage them together as a group. For instance, you can have a set of checks and autoscale your Elasticsearch ensemble. You are right about the statuses of both projects; however, I would encourage you to use the second one and eventually contribute to it : ) – rukletsov Jun 12 '15 at 14:56