I have a basic aggregation on an index with about 40 million documents.

{
    aggs: {
        countries: {
            filter: {
                bool: {
                    must: my_filters,
                }
            },
            aggs: {
                filteredCountries: {
                    terms: {
                        field: 'countryId',
                        min_doc_count: 1,
                        size: 15,
                    }
                }
            }
        }
    }
}

The index:

{
    "settings": {
        "number_of_shards": 5, 
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "autocomplete_filter",
                        "unique"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "id": {
                "type": "integer"
            },
            "name": {
                "type": "text",
                "analyzer": "autocomplete",
                "search_analyzer": "standard"
            },
            "countryId": {
                "type": "short"
            }
        }
    }
}

The search response time is 100ms, but the aggregation response time is about 1.5s and keeps increasing as we add more documents (it was about 200ms with 5 million documents). There are about 20 distinct countryId values right now.

What I tried so far:

  1. Allocating more RAM (from 4GB to 32GB): same results.
  2. Changing the countryId field type to keyword and adding the eager_global_ordinals option: this made things worse.
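
For reference, the mapping change described in item 2 would look roughly like the sketch below. It is reconstructed from the description only, so the exact mapping that was tested may have differed; the constant name `countryIdKeywordMapping` is made up for illustration.

```javascript
// Assumed shape of the countryId mapping variant from item 2
// (reconstructed from the description, not copied from the real index).
const countryIdKeywordMapping = {
    properties: {
        countryId: {
            type: 'keyword',
            // Build global ordinals at refresh time rather than on the
            // first aggregation that needs them.
            eager_global_ordinals: true,
        },
    },
};

console.log(JSON.stringify(countryIdKeywordMapping, null, 2));
```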

The Elasticsearch version is 7.8.0. Elasticsearch has 8GB of RAM; the server has 64GB of RAM and 16 CPUs, with 5 shards on 1 node.

I use this aggregation to build the filters shown with search results, so I need it to respond as fast as possible. For a large number of results I don't need precision, so an approximate count, or one capped at some number (e.g. "100 or more"), would be fine.

Any ideas on how to speed up this aggregation?

Valera

1 Answer

Reason for the slowness:

  1. Bucket explosion is the likely reason. As per the docs, the breadth_first collect mode can help with it.

The docs illustrate this with finding the 10 most popular actors and their 5 top co-actors:

"Even though the number of actors may be comparatively small and we want only 50 result buckets there is a combinatorial explosion of buckets during calculation - a single actor can produce n² buckets where n is the number of actors."
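
A minimal sketch of setting the collect mode on the terms aggregation from the question. `withCollectMode` is a hypothetical helper, not an Elasticsearch API; only `collect_mode` and its values `breadth_first` and `depth_first` are terms aggregation parameters. Since the collect mode governs when child aggregations are computed, it matters mainly when the terms aggregation has sub-aggregations, and `filteredCountries` in the question has none.

```javascript
// Hypothetical helper: returns the question's terms aggregation with
// collect_mode set to the given value.
function withCollectMode(mode) {
    return {
        terms: {
            field: 'countryId',
            min_doc_count: 1,
            size: 15,
            collect_mode: mode, // 'breadth_first' or 'depth_first'
        },
    };
}

// This object would replace the filteredCountries aggregation in the request.
const breadthFirstCountries = withCollectMode('breadth_first');
console.log(JSON.stringify(breadthFirstCountries, null, 2));
```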

  1. I would suggest setting execution_hint. Since you have very few unique values, I suggest setting the hint to map.

  2. Another optimization: if, say, some documents have not been accessed in the last few weeks, you can use a field from your filter to restrict the aggregation to a particular subset of documents.

  3. Another optimization: use include/exclude to restrict the aggregation to the countries you need, if your use case allows it.
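
Suggestions 1 and 3 could be sketched as follows. `buildCountriesAgg` and its option names are hypothetical, and the country IDs in the usage are placeholders; `execution_hint` and `include` are the terms aggregation parameters being set. Elasticsearch ignores an execution hint where it is not applicable, so the hint may have no effect depending on the field type.

```javascript
// Hypothetical builder for the request body from the question, with
// optional execution_hint and include settings on the terms aggregation.
function buildCountriesAgg(myFilters, { executionHint, include } = {}) {
    const terms = { field: 'countryId', min_doc_count: 1, size: 15 };
    if (executionHint) terms.execution_hint = executionHint; // e.g. 'map'
    if (include) terms.include = include; // exact values to keep
    return {
        aggs: {
            countries: {
                filter: { bool: { must: myFilters } },
                aggs: { filteredCountries: { terms } },
            },
        },
    };
}

// Usage with placeholder filters and placeholder country IDs.
const exampleBody = buildCountriesAgg(
    [{ match: { name: 'abc' } }],
    { executionHint: 'map', include: ['1', '2', '3'] }
);
console.log(JSON.stringify(exampleBody, null, 2));
```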

Gibbs
  • Thank you for the suggestions. Breadth first didn't change anything. We have only about 20 unique terms, so partitioning would not really help either. As for filtering, we can't remove any countries from the search. – Valera Jun 27 '20 at 21:39
  • Did you try setting the execution hint to map? Can you add your filters too? – Gibbs Jun 27 '20 at 21:41
  • execution_hint didn't help. The filters are only a `match` on the `name` field, but the filter is surely not the problem, because without them (an aggregation with no filters) it is slower than the aggregation with filters – Valera Jun 27 '20 at 21:45
  • What did you set for execution hint? Ordinal or map? – Gibbs Jun 27 '20 at 21:48
  • Was it better or worse? I mean the time taken, so we can compare. How many nodes and shards do you have? – Gibbs Jun 27 '20 at 21:49
  • map is about 200ms slower than ordinal on average. 5 shards, 1 node – Valera Jun 27 '20 at 21:53