I am using Elasticsearch 7.6 and the PHP client API for all operations. I created the index with the following settings and mappings:
$params = [
    'index' => $index,
    'body' => [
        'settings' => [
            "number_of_shards" => 1,
            "number_of_replicas" => 0,
            "index.queries.cache.enabled" => false,
            "index.soft_deletes.enabled" => false,
            "index.refresh_interval" => -1,
            "index.requests.cache.enable" => false,
            "index.max_result_window" => 2000000
        ],
        'mappings' => [
            '_source' => [
                "enabled" => false
            ],
            'properties' => [
                "text" => [
                    "type" => "text",
                    "index_options" => "docs"
                ]
            ]
        ]
    ]
];
My Boolean OR search query is as follows:
$json = '{
    "from": 0,
    "size": 2000000,
    "query": {
        "bool": {
            "filter": {
                "match": {
                    "text": {
                        "query": "apple orange grape banana",
                        "operator": "or"
                    }
                }
            }
        }
    }
}';
I have indexed 2 million documents in such a way that every document matches the query, and I get all of them back as expected. Since every document matches, I avoid scoring by putting the match query in the filter clause of the bool query.
However, the following message is logged repeatedly until the query finishes executing. I sometimes see the same message when bulk indexing documents:
[2020-05-15T19:15:45,720][INFO ][o.e.m.j.JvmGcMonitorService] [node1] [gc][14] overhead, spent [393ms] collecting in the last [1.1s]
[2020-05-15T19:15:47,822][INFO ][o.e.m.j.JvmGcMonitorService] [node1] [gc][16] overhead, spent [399ms] collecting in the last [1s]
[2020-05-15T19:15:49,827][INFO ][o.e.m.j.JvmGcMonitorService] [node1] [gc][18] overhead, spent [308ms] collecting in the last [1s]
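To put those lines in perspective, here is a rough sketch of how I read the two bracketed values: time spent collecting divided by the length of the reporting window. The helper name gcOverheadPercent is my own and not part of Elasticsearch, and the window lengths in the log are rounded, so the results are approximate.

```php
<?php
// Hypothetical helper: percentage of a time window spent in garbage collection.
function gcOverheadPercent(float $collectingMs, float $windowMs): float
{
    return round($collectingMs / $windowMs * 100, 1);
}

// Values copied from the three log lines (1.1s = 1100 ms, 1s = 1000 ms).
$samples = [[393, 1100], [399, 1000], [308, 1000]];

foreach ($samples as [$gcMs, $windowMs]) {
    echo gcOverheadPercent($gcMs, $windowMs), "%\n";
}
```

By this reading the node spends roughly 30 to 40 percent of each second in GC while the query runs.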
I have allocated 16 GB of heap memory. No other warnings appear in the Elasticsearch log. What could be causing this? Or is it expected when retrieving a huge number of documents? I am aware of the scroll API, but I am curious why this happens when I use a large value for index.max_result_window. Any help is much appreciated. Thanks in advance!
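For comparison, the scroll-based alternative I am contrasting this with would look roughly like the sketch below. It assumes $client is an elasticsearch-php 7.x client; the helper name collectIdsWithScroll, the page size of 10000, and the 1m keep-alive are illustrative choices of mine. Since _source is disabled in my mapping, only the document IDs are collected.

```php
<?php
// Sketch only: pages through all hits with the scroll API instead of one huge "size".
// $client is assumed to expose search(), scroll() and clearScroll() as in elasticsearch-php 7.x.
function collectIdsWithScroll($client, string $index, array $query, int $pageSize = 10000): array
{
    $ids = [];

    // First page: opens a scroll context that is kept alive for one minute.
    $response = $client->search([
        'index'  => $index,
        'scroll' => '1m',
        'size'   => $pageSize,
        'body'   => ['query' => $query],
    ]);

    // Keep fetching pages until one comes back empty.
    while (!empty($response['hits']['hits'])) {
        foreach ($response['hits']['hits'] as $hit) {
            $ids[] = $hit['_id'];
        }
        $response = $client->scroll([
            'body' => ['scroll_id' => $response['_scroll_id'], 'scroll' => '1m'],
        ]);
    }

    // Release the scroll context on the server once done.
    $client->clearScroll(['body' => ['scroll_id' => $response['_scroll_id']]]);

    return $ids;
}
```

Each request here returns at most $pageSize hits, whereas my current query asks for all 2,000,000 hits in a single response.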