Elasticsearch search response latency peaks on _refresh with G1GC

Question

Our Elasticsearch cluster has 3 master nodes and 12 data nodes spread across 2 IDCs.

In front of Elasticsearch there is a search API server that runs a multisearch across three indexes. Under normal conditions its average response time is below 200 ms.

The index in question is one of the three indexes queried by the multisearch; it is configured with 4 shards and 2 replicas. We incrementally index into it through the bulk API every 5 minutes.

The problem arises here. During every incremental indexing run, some search requests are delayed by 2-3 seconds or more.

We have confirmed the following facts.

- We tried changing the bulk API batch size to 1, 100, 200, 500, 1000, and 2000, but the delays occur in every case.
- CPU, memory, and disk I/O show nothing unusual at the time of the delays.
- If we set refresh_interval to -1, no delay occurs during indexing.
- If _refresh is then called some time after indexing, the delay occurs.
- When we change the JDK's GC from G1GC to CMS, there is no delay.
- Checking the GC count with jstat at the time of the delays shows only the young GCs that normally occur periodically.
- CMS uses much less heap memory and slightly less CPU than G1GC, but this seems to be expected.

We would like to understand why the problem occurs and how to solve it properly. Any advice is appreciated.
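
For reference, the refresh experiment described above (disable refresh_interval, index, then call _refresh by hand and time it) can be sketched roughly as follows. This is only a minimal sketch using the Python standard library: the node address, the index name `my-index`, and the helper names are assumptions for illustration, not our production code.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: address of one cluster node
INDEX = "my-index"                # hypothetical index name


def settings_request(index, refresh_interval, base_url=ES_URL):
    """Build a PUT /<index>/_settings request that sets index.refresh_interval.

    Passing None sends JSON null, which resets the setting to its default.
    """
    body = json.dumps({"index": {"refresh_interval": refresh_interval}})
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def refresh_request(index, base_url=ES_URL):
    """Build a POST /<index>/_refresh request."""
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")


def timed_send(request, timeout=30):
    """Send the request and return (HTTP status, elapsed seconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()
        status = response.status
    return status, time.perf_counter() - start


def run_experiment(index=INDEX):
    """Disable refresh, leave room for indexing, then time a manual refresh."""
    timed_send(settings_request(index, "-1"))
    try:
        # The bulk indexing run would go here.
        status, elapsed = timed_send(refresh_request(index))
        print(f"_refresh returned {status} after {elapsed:.3f}s")
    finally:
        # Reset refresh_interval to the default even if the refresh fails.
        timed_send(settings_request(index, None))
```

Calling `run_experiment()` against a test cluster while the search traffic is running should show whether the slow searches line up with the manual refresh rather than with the bulk requests themselves.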

Elasticsearch version: 7.16.3
Java heap size setting: 30 GB
Machine memory: 64 GB

As always with GC issues: enable GC logging, look for large pauses, and try to find out what causes them. - the8472, Mar 20 '23 at 10:57
Oh, I did that already, but no large pauses were found. G1GC shows shorter pauses than CMS GC. - Seong Gwon Min, Mar 22 '23 at 02:05
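
The check suggested in the comments, scanning the GC log for long pauses, could look roughly like the sketch below. It assumes JDK unified logging (`-Xlog:gc*`) where each pause summary line ends with a heap transition and a duration in milliseconds; the regular expression, the function name, and the log path in the usage note are illustrative assumptions, and the exact line layout depends on the configured log decorators.

```python
import re

# Matches pause summary lines such as:
#   ... GC(42) Pause Young (Normal) (G1 Evacuation Pause) 12288M->6144M(30720M) 35.123ms
# Assumption: unified logging output; other log formats need a different pattern.
PAUSE_RE = re.compile(r"GC\((\d+)\)\s+(Pause.*?)\s+\S+->\S+\s+([\d.]+)ms\s*$")


def find_long_pauses(lines, threshold_ms):
    """Return (gc_id, pause_kind, duration_ms) for pauses at or above threshold_ms."""
    hits = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match is None:
            continue  # not a pause summary line
        gc_id, kind, duration = match.groups()
        duration_ms = float(duration)
        if duration_ms >= threshold_ms:
            hits.append((int(gc_id), kind, duration_ms))
    return hits
```

It can be run over a log file with something like `find_long_pauses(open("logs/gc.log"), 200.0)`, where the path is a placeholder. An empty result is consistent with the observation above that no large pauses show up in the log.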
