Can someone help me craft an Elasticsearch query that is likely to time out on a few thousand records/documents? I would like to see what is actually returned when an aggregation request times out. Is this documented anywhere?

My attempt so far:

POST /myindex/_search?size=0
{
    "aggs" : {
        "total-cost" : {
            "sum" : {
                "field" : "cost",
                "missing": 1 
            }
        }
    }
}

The reason for this question is sometimes in production I get a response that's missing the "total-cost" aggregation. I have a hunch it might be due to timeouts. That's why I want to see what is returned exactly when a request times out.

I've also looked at how to set the request timeout in the Kibana console, and apparently there is no way to do this.

NB. I am talking about search timeouts, not connection timeouts.

Sergey Slepov

1 Answer

As I understand it, query_timeout does not work the way you might expect in Elasticsearch, for a few reasons.

Elasticsearch executes a search in two phases when you send a request to the cluster: the query phase and the fetch phase. Specifying a timeout does cause Elasticsearch to return a partial response once the timeout has (roughly) elapsed, but it does not stop the server from finishing the query execution, so it is of no use for limiting server load.

See the warning in the timeout documentation:

It’s important to know that the timeout is still a best-effort operation; it’s possible for the query to surpass the allotted timeout. There are two reasons for this behavior:

Timeout checks are performed on a per-document basis. However, some query types have a significant amount of work that must be performed before documents are evaluated. This "setup" phase does not consult the timeout, and so very long setup times can cause the overall latency to shoot past the timeout.

Because the time is once per document, a very long query can execute on a single document and it won’t timeout until the next document is evaluated. This also means poorly written scripts (e.g. ones with infinite loops) will be allowed to execute forever.
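The two reasons quoted above can be illustrated with a toy model. This is a minimal sketch in Python, not Elasticsearch's actual implementation: `run_with_timeout`, its parameters, and the abstract cost units are hypothetical names chosen for illustration. It only assumes what the quote states, namely that the deadline is consulted once per document and never during setup.

```python
def run_with_timeout(doc_costs, timeout, setup_cost=0.0):
    """Toy model of a per-document timeout check (not Elasticsearch code).

    doc_costs  -- hypothetical cost of evaluating each document, in abstract time units
    timeout    -- the allotted time, in the same units
    setup_cost -- work done before any document is evaluated
    """
    elapsed = setup_cost  # the setup phase never consults the timeout
    processed = 0
    timed_out = False
    for cost in doc_costs:
        # The deadline is only checked here, once per document.
        if elapsed >= timeout:
            timed_out = True
            break
        elapsed += cost  # a slow document runs to completion before the next check
        processed += 1
    return {"timed_out": timed_out, "processed": processed, "elapsed": elapsed}


# A long setup phase overshoots the timeout before any check happens.
slow_setup = run_with_timeout([1, 1, 1], timeout=2, setup_cost=10)

# A single expensive document is not interrupted; the check fires only afterwards.
slow_doc = run_with_timeout([100, 1], timeout=2)
```

In both cases the model reports a timeout, yet the elapsed time is far beyond the allotted 2 units, which is the "best-effort" behaviour the documentation warns about.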

You might now wonder whether, in this scenario, the cluster could go down or hit an OutOfMemory exception. That risk can be handled with the circuit breaker settings.

See also GitHub issue #60037.
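To see what a timed-out response looks like on the client side, one option is to send the aggregation with a very small `timeout` and then inspect the top-level `timed_out` flag of the response. The following is a minimal sketch in Python that works on plain dictionaries, so it runs without a cluster. The helper names `build_search_body` and `read_total_cost` are hypothetical, and `sample_response` is a hand-written illustration of the general response shape, not output captured from a real cluster.

```python
def build_search_body(timeout="1ms"):
    """Request body for the question's aggregation, with a search timeout added."""
    return {
        "size": 0,
        "timeout": timeout,  # search timeout, not a connection timeout
        "aggs": {
            "total-cost": {"sum": {"field": "cost", "missing": 1}},
        },
    }


def read_total_cost(response):
    """Extract the aggregation value and flag results that cannot be trusted."""
    timed_out = response.get("timed_out", False)
    agg = response.get("aggregations", {}).get("total-cost")
    value = agg.get("value") if agg else None
    return {
        "timed_out": timed_out,
        "total_cost": value,
        # Treat the value as complete only if the search did not time out
        # and the aggregation is actually present.
        "complete": (not timed_out) and value is not None,
    }


# Hand-written example of a partial response (illustrative values only).
sample_response = {
    "took": 3,
    "timed_out": True,
    "aggregations": {"total-cost": {"value": 12.0}},
}
result = read_total_cost(sample_response)
```

Checking both `timed_out` and the presence of the aggregation keeps the two failure modes apart: a partial sum and an aggregation that is missing altogether.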

Sagar Patel
  • Thanks for your answer. My question was not about how to avoid timeouts or manage server memory. It was about the protocol, the error reporting API. What does ES return when a request times out? No results? Zero results? Partial results? – Sergey Slepov Mar 07 '22 at 11:08
  • As I mentioned in my answer, it returns partial results, and the response contains a key called `timed_out` with the value `true`. – Sagar Patel Mar 07 '22 at 11:25