I have 3 master nodes and 5 data nodes on instance type m4.large.elasticsearch (2 vCPUs and 8 GB of memory), with 512 GB of storage.

Please suggest the maximum number of shards and replicas I can create for the above configuration.

Kumaran
  • You're looking at the problem from the wrong end: you don't size your cluster based on your hardware, but based on your data. Start from the data (volume, indexing frequency, queries, mappings, etc.) and then iterate from there to decide how to shape your cluster (hardware, index, shard size, etc.). – Val Jun 22 '18 at 07:22
  • These are the expected data sizes as the data grows: June 2018 - 1022 GB - 34 shards; Sept 2018 - 3395 GB - 114 shards; June 2019 - 4820 GB - 160 shards. I calculated the number of shards with this formula: index size / 30 GB. Please suggest whether I can create 114 shards in the above configuration, and how many data nodes I should keep. – Kumaran Jun 22 '18 at 08:32

1 Answer

You can have as many shards and replicas as your data volume and usage require.

Replicas are primarily for search performance, and you can add or remove them at any time. They give you additional capacity, higher throughput, and stronger failover. It is generally recommended that a production cluster have 2 replicas for failover. Also note that each replica is a full copy of the primary data, so every replica you add increases disk usage by the size of the primaries (going from 1 replica to 2 raises storage from 2x to 3x the primary size).
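To make the disk cost of replicas concrete, here is a minimal sketch of the arithmetic. The function name `total_storage_gb` is made up for illustration, the 1022 GB figure is the June 2018 size from the question's comment, and indexing overhead and free-space headroom are deliberately ignored.

```python
def total_storage_gb(primary_gb: float, replicas: int) -> float:
    """Disk needed for the primaries plus `replicas` full copies.

    Simplification: ignores indexing overhead, OS reserve, and headroom.
    """
    if replicas < 0:
        raise ValueError("replicas must be >= 0")
    return primary_gb * (1 + replicas)


# 1022 GB is the June 2018 data size quoted in the question's comment.
for replicas in range(3):
    print(replicas, "replica(s):", total_storage_gb(1022, replicas), "GB")
```

Each step adds one more full copy of the primary data, so the total grows linearly with the replica count.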

The number of shards you can hold on a node is proportional to the amount of heap available, but Elasticsearch enforces no fixed limit. A good rule of thumb is to keep the number of shards per node below 20 to 25 per GB of heap configured on that node. A node with a 30 GB heap should therefore hold at most 600-750 shards, and the further below this limit you can keep it, the better. This will generally help the cluster stay in good health.

After you create an index, it is critically important to realize that you cannot change its number of primary shards in place (the number of replicas can still be changed at any time). If you later find it necessary to change the number of shards, you need to reindex all the source documents. Although reindexing is a long process, it can be done without downtime.
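The rule of thumb above can be turned into a quick calculation. This is only a sketch: the helper names are hypothetical, and the 4 GB heap is an assumption (half of the 8 GB of RAM on an m4.large.elasticsearch node), not a figure stated in the question. The 114 primaries come from the asker's own estimate.

```python
def max_shards_per_node(heap_gb: float, per_gb=(20, 25)) -> tuple:
    """Return the (low, high) shard ceiling for one node from the 20-25 per GB heap rule."""
    low, high = per_gb
    return int(heap_gb * low), int(heap_gb * high)


def shard_copies(primaries: int, replicas: int) -> int:
    """Total shards in the cluster: every primary plus each of its replicas."""
    return primaries * (1 + replicas)


ASSUMED_HEAP_GB = 4   # assumption: half of the node's 8 GB RAM goes to the JVM heap
DATA_NODES = 5        # from the question

low, high = max_shards_per_node(ASSUMED_HEAP_GB)
print("per-node ceiling:", low, "to", high)
print("cluster ceiling:", low * DATA_NODES, "to", high * DATA_NODES)
print("114 primaries with 2 replicas:", shard_copies(114, 2), "shards")
```

Keep in mind that this only checks the heap-based ceiling. Disk capacity, indexing rate, and query load are separate constraints and still have to be sized from the data.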

  • These are the expected data sizes as the data grows: June 2018 - 1022 GB - 34 shards; Sept 2018 - 3395 GB - 114 shards; June 2019 - 4820 GB - 160 shards. I calculated the number of shards with this formula: index size / 30 GB. Please suggest whether I can create 114 shards in the above configuration, and how many data nodes I should keep. – Kumaran Jun 22 '18 at 08:34