
We have an ES cluster at AWS running with the following setup:

(I know, I need a minimum of 3 master nodes.)

  • 1 coordinator node
  • 2 data nodes
  • 1 master node

Data node specs:

  • CPU: 8 cores
  • RAM: 20 GB
  • Disk: 1 TB SSD, 4000 IOPS

Problem:

The ES endpoints for Search, Delete, Backup, Cluster Health, and Insert are all working fine.

Since yesterday, some endpoints like /_cat/indices, /_nodes/_local/stats, etc. have been taking far too long to respond (more than 4 minutes) :( and consequently our Kibana is in a red state (Timeout after 30000ms).
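For what it's worth, this is roughly how I'm timing the slow endpoints: a minimal sketch in Python with the requests library, assuming the coordinator is reachable on localhost:9200 (which is where ours listens):

```python
import requests

ES = "http://localhost:9200"  # assumption: coordinator node on localhost

# Endpoints that normally answer in milliseconds but now take minutes.
slow_endpoints = ["/_cat/indices?v", "/_nodes/_local/stats"]

for path in slow_endpoints:
    try:
        resp = requests.get(ES + path, timeout=300)  # generous 5-minute timeout
        print(f"{path}: HTTP {resp.status_code} in {resp.elapsed.total_seconds():.1f}s")
    except requests.exceptions.Timeout:
        print(f"{path}: timed out after 300s")
```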

Useful info:

  • All shards are OK (3,500 in total)
  • The cluster is in a green state (see the health check sketched after this list)
  • X-Pack is disabled
  • Average of 1 GB per shard
  • 500k document count
  • Requests are made from localhost within AWS
  • CPU, disk, RAM, and IOPS are all fine
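The health and shard figures above come from the standard cluster health endpoint; here is a minimal sketch of how I pull them (same localhost:9200 assumption as above):

```python
import requests

# Assumption: the coordinator node is reachable on localhost:9200.
health = requests.get("http://localhost:9200/_cluster/health", timeout=30).json()

print("status:       ", health["status"])                # expected: "green"
print("active shards:", health["active_shards"])         # expected: ~3500
print("data nodes:   ", health["number_of_data_nodes"])  # expected: 2
```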

Any ideas?

Thanks in advance :)

EDIT/SOLUTION 1:

After a few days I found out what the problem was, but first a little bit of context...

We use Elasticsearch for storing user audit messages and mobile error messages. At the start (obviously in a rush to deliver new microservices and take load off our MongoDB cluster) we designed the Elasticsearch indices by day, so every day a new index was created, and by the end of the day that index held around 6-9 GB of data. Six months later, almost 180 indices and 720 open primary shards bigger, we bumped into this problem.
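To put numbers on how quickly daily indices pile up shards, here is a rough back-of-the-envelope sketch; the 4 primaries per index is inferred from the 720 primaries over ~180 indices above, and the single replica is my assumption:

```python
# Back-of-the-envelope shard math for daily indices.
days = 180                 # roughly six months of daily indices
primaries_per_index = 4    # inferred: 720 primaries / 180 indices
replicas = 1               # assumption: one replica per primary

primary_shards = days * primaries_per_index        # 720
total_shards = primary_shards * (1 + replicas)     # 1440, from these indices alone

print(f"{days} daily indices -> {primary_shards} primary shards, "
      f"{total_shards} shards including replicas")
```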

Then I read this again (the basics!): https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html

After talking to the team responsible for this microservice, we redesigned our indices to be monthly, and guess what? Problem solved!

Now our cluster is much faster than before, and this simple command saved me some sweet nights of sleep:

https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
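For reference, consolidating the old daily indices into a monthly one boiled down to calls like this; a minimal sketch in Python with requests, where the index names (audit-2018.01.*) are only illustrative, not our real ones:

```python
import requests

ES = "http://localhost:9200"  # assumption: the coordinator node answers on localhost

# Copy one month's worth of daily indices into a single monthly index.
# The index names below are illustrative placeholders.
body = {
    "source": {"index": ["audit-2018.01.01", "audit-2018.01.02"]},  # ...and so on for the month
    "dest": {"index": "audit-2018-01"},
}

# wait_for_completion=false makes Elasticsearch return a task ID immediately
# instead of holding the HTTP connection open for the whole reindex.
resp = requests.post(f"{ES}/_reindex", params={"wait_for_completion": "false"}, json=body)
resp.raise_for_status()
print(resp.json())  # e.g. {"task": "node_id:task_id"} -- poll /_tasks/<id> to follow progress

# Only after verifying the monthly index should the old daily indices be deleted,
# e.g. requests.delete(f"{ES}/audit-2018.01.01"), which finally frees their shards.
```

Fewer, larger indices means far fewer shards for the master to track, which (at least in our case) is what made /_cat/indices and the stats endpoints usable again.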

Thanks!
