
We have Node.js (NestJS/Express) based application services running on GCP.

We are using Elasticsearch to support full-text search on a blog/news website.
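For context, a typical read from the service looks roughly like the sketch below. This is only an illustration: the index name `articles`, the field names, and the use of the v7 `@elastic/elasticsearch` client are assumptions, not our exact code.

```typescript
// Rough sketch of a typical search call (v7 @elastic/elasticsearch client).
// 'articles' and the field names are placeholders for illustration.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://elasticsearch-coordinating:9200' });

async function searchArticles(text: string) {
  const { body } = await client.search({
    index: 'articles',
    body: {
      size: 20,
      query: {
        multi_match: {
          query: text,
          fields: ['title^2', 'body'],
        },
      },
    },
  });
  return body.hits.hits;
}
```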

Our requirement is to support a minimum of 2,000 reads per second.

During load testing, we observed that up to a concurrency of 300, Elasticsearch performs well and response times are acceptable.

CPU usage also rises under this load. But when the load is increased to 500 or 1000 concurrent requests, CPU usage drops and response times increase drastically.

What we don't understand is why CPU usage is around 80% at a concurrency of 300 but only 30-40% when the load increases further. Shouldn't CPU pressure increase with load?
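For reference, the tests were driven by a script along these lines. The tool (autocannon), the endpoint URL, and the query string are assumptions for illustration; any HTTP load generator sweeping the same concurrency levels would do.

```typescript
// Minimal load-test sketch using autocannon (npm i autocannon).
// The endpoint URL and query string are placeholders, not our real API.
import autocannon from 'autocannon';

async function runTest(connections: number) {
  const result = await autocannon({
    url: 'http://localhost:3000/search?q=election',
    connections, // concurrent connections: 300, 500, 1000, ...
    duration: 60, // seconds
  });
  console.log(
    `connections=${connections} ` +
      `rps=${result.requests.average} ` +
      `latency p99=${result.latency.p99}ms`
  );
}

// Sweep the concurrency levels mentioned above.
(async () => {
  for (const c of [300, 500, 1000]) {
    await runTest(c);
  }
})();
```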

What is the right way to tune Elasticsearch for read-heavy usage? (Our write frequency is just one document every 2-3 hours.)

We have a single index with approximately 2 million documents; the index size is just 6 GB.

The Elasticsearch cluster is deployed on Kubernetes using Helm charts, with:

 - 1 dedicated master node 
 - 1 dedicated coordinating node
 - 5 dedicated data nodes

Considering the small data size, the index has a single primary shard, and the number of replicas is set to 4.

The index refresh interval is set to 30 s.
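For completeness, the replica and refresh settings above amount to roughly the following (index name is a placeholder; in reality these are applied by our provisioning scripts, this is just a sketch with the v7 JS client):

```typescript
// Sketch of the index-level settings described above, applied with the
// v7 @elastic/elasticsearch client. 'articles' is a placeholder index name.
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://elasticsearch-coordinating:9200' });

async function applySettings() {
  // Replica count and refresh interval are dynamic index settings.
  await client.indices.putSettings({
    index: 'articles',
    body: {
      index: {
        number_of_replicas: 4, // one copy on each remaining data node
        refresh_interval: '30s', // writes are rare (~1 doc every 2-3 hours)
      },
    },
  });
  // number_of_shards (1 primary) can only be set at index creation time.
}
```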

Each data node is allocated 2 GB of RAM with a 1 GB heap, and 1 vCPU.

We tried increasing the search thread pool size to 20 and the queue_size to 10,000, but that didn't help much either.
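One way to see whether requests are queueing or being rejected in the search thread pool during a test is a sketch like the one below. This is an assumption about how to observe it, not part of our production tooling; it uses the v7 JS client's nodes-stats API.

```typescript
// Sketch: poll search thread-pool stats on each node to see whether requests
// are queueing or being rejected under load (v7 @elastic/elasticsearch client).
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://elasticsearch-coordinating:9200' });

async function checkSearchThreadPool() {
  const { body } = await client.nodes.stats({ metric: 'thread_pool' });
  for (const [nodeId, node] of Object.entries<any>(body.nodes)) {
    const search = node.thread_pool.search;
    console.log(
      `${node.name} (${nodeId}): active=${search.active} ` +
        `queue=${search.queue} rejected=${search.rejected}`
    );
  }
}
```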

