Can aggregated buckets from composite aggregation be merged?

Sample buckets from the composite aggregation:

[
  {"unique_app_id": "app_0001", "app_title": "calendar v1"},
  {"unique_app_id": "app_0001", "app_title": "calendar v2"},
  {"unique_app_id": "app_0001", "app_title": "calendar v3"}
]

What I'm trying to get:

[
  {"unique_app_id": "app_0001", "app_title": "calendar v3"}
]

The app has the following requirements:

  • it supports semantic versioning
  • the app title is available in multiple languages
  • a new version can be released with a modified title

The app catalog (the ES query I'm having trouble with) has these requirements:

  • sort by app title
  • pagination
  • only show the latest version of each app

The problem:

  • because each version can have a different title, sorting with the composite aggregation generates one bucket per unique title
  • which causes the same app to show up more than once in the app catalog
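
To make the duplication concrete, here is a minimal Python sketch (not Elasticsearch code) that mimics how the composite key is built from the title and the app id. The in-memory `docs` list and the `composite_keys` helper are made up for illustration, and only the English titles from the sample data below are used.

```python
# Hypothetical in-memory copies of the two sample documents (English titles only).
docs = [
    {"unique_app_id": "app_0001", "version": "1.2.3", "app_title": "calendar v1"},
    {"unique_app_id": "app_0001", "version": "2.1.0", "app_title": "calendar v2"},
]


def composite_keys(docs):
    """Return the distinct (title, app id) pairs, sorted by title and then by app id."""
    return sorted({(d["app_title"], d["unique_app_id"]) for d in docs})


keys = composite_keys(docs)
# One key per distinct title, so the single app appears twice.
print(keys)
```

Since the title is part of the key, two versions of the same app with different titles can never land in the same bucket.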

Here is the sample data:

PUT trial_index/_doc/1
{
    "version": "1.2.3",
    "major_version": 1.0,
    "minor_version": 2.0,
    "patch_version": 3.0,
    "default_language": "english",
    "language_data": {
        "english": {
            "app_title": "calendar v1"
        },
        "french": {
            "app_title": "calendrier v1"
        },
        "spanish": {
            "app_title": "calendario v1"
        }
    },
    "unique_app_id": "app_0001"
}

PUT trial_index/_doc/2
{
    "version": "2.1.0",
    "major_version": 2.0,
    "minor_version": 1.0,
    "patch_version": 0.0,
    "default_language": "english",
    "language_data": {
        "english": {
            "app_title": "calendar v2"
        },
        "french": {
            "app_title": "calendrier v2"
        },
        "spanish": {
            "app_title": "calendario v2"
        }
    },
    "unique_app_id": "app_0001"
}

The query, simplified:

sort by title (composite aggregation)
  group by unique_app_id (terms aggregation)
    pick the latest version (top_hits aggregation)
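
For comparison, this is the result I am after, written as a minimal Python sketch over in-memory documents. It is not an Elasticsearch solution; it only shows the intended semantics, with the steps in the opposite order: pick the latest version per app first, then sort by title and paginate. The second app (`app_0002`), the flattened `app_title` field, and the function names are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical in-memory documents; app_0002 is invented so the sort is visible.
docs = [
    {"unique_app_id": "app_0001", "major_version": 1, "minor_version": 2,
     "patch_version": 3, "app_title": "calendar v1"},
    {"unique_app_id": "app_0001", "major_version": 2, "minor_version": 1,
     "patch_version": 0, "app_title": "calendar v2"},
    {"unique_app_id": "app_0002", "major_version": 1, "minor_version": 0,
     "patch_version": 0, "app_title": "alarm v1"},
]


def version_key(doc):
    """Compare versions as (major, minor, patch) tuples."""
    return (doc["major_version"], doc["minor_version"], doc["patch_version"])


def latest_per_app(docs):
    """Keep only the highest-versioned document for each unique_app_id."""
    by_app = sorted(docs, key=itemgetter("unique_app_id"))
    return [max(group, key=version_key)
            for _, group in groupby(by_app, key=itemgetter("unique_app_id"))]


def catalog_page(docs, offset, size):
    """Sort the latest versions by title and return one page of them."""
    latest = sorted(latest_per_app(docs), key=itemgetter("app_title"))
    return latest[offset:offset + size]


page = catalog_page(docs, 0, 10)
print([d["app_title"] for d in page])
```

Each app appears once, under the title of its latest version, which is what the catalog should show.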

The full query:

GET trial_index/_search
{
  "from": 0,
  "size": 0,
  "aggregations": {
    "my_pager": {
      "composite": {
        "size": 100,
        "sources": [{
            "sort_by_title": {
              "terms": {
                "script": {
                  "source": """

// return title of requested language
// if app title does not exist in requested language, return default

def source = params['_source'];
if (source['language_data'].containsKey(params.requestedCode)) {
    return source['language_data'][params.requestedCode]['app_title']
} else {
    return source['language_data'][doc['default_language.keyword'].value]['app_title']
}
                  """,
                  "lang": "painless",
                  "params": {
                    "requestedCode": "french"
                  }
                },
                "missing_bucket": true,
                "order": "asc"
              }
            }
          },
          {
            "page_key": {
              "terms": {
                "field": "unique_app_id.keyword",
                "missing_bucket": false,
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "application": {
          "terms": {
            "field": "unique_app_id.keyword",
            "size": 10,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [{
                "_count": "desc"
              },
              {
                "_key": "asc"
              }
            ]
          },
          "aggregations": {
            "latest_version": {
              "top_hits": {
                "from": 0,
                "size": 1,
                "version": false,
                "seq_no_primary_term": false,
                "explain": false,
                "_source": {
                  "includes": [
                    "unique_app_id",
                    "version"
                  ],
                  "excludes": []
                },
                "script_fields": {
                  "default_language": {
                    "script": {
                      "source": """
// return title of requested language
// if app title does not exist in requested language, return default

[params['_source']['language_data'].containsKey(params.requestedCode) ? (params['_source']['language_data'][params.requestedCode]['app_title']) : (params['_source']['language_data'][doc['default_language.keyword'].value]['app_title'])]
                      """,
                      "lang": "painless",
                      "params": {
                        "requestedCode": "french"
                      }
                    },
                    "ignore_failure": false
                  }
                },
                "sort": [{
                    "major_version": {
                      "order": "desc"
                    }
                  },
                  {
                    "minor_version": {
                      "order": "desc"
                    }
                  },
                  {
                    "patch_version": {
                      "order": "desc"
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}
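
The title fallback used by both Painless scripts can be restated as a short Python sketch, which may be easier to read than the inline scripts. The `resolve_title` name is made up, and the sketch assumes that `default_language` always has an entry in `language_data`.

```python
def resolve_title(source, requested_code):
    """Return the title in the requested language, else in the default language."""
    language_data = source["language_data"]
    if requested_code in language_data:
        return language_data[requested_code]["app_title"]
    # Assumption: the default language is always present in language_data.
    return language_data[source["default_language"]]["app_title"]


# The first sample document, reduced to the fields the fallback reads.
doc = {
    "default_language": "english",
    "language_data": {
        "english": {"app_title": "calendar v1"},
        "french": {"app_title": "calendrier v1"},
        "spanish": {"app_title": "calendario v1"},
    },
}

print(resolve_title(doc, "french"))
print(resolve_title(doc, "german"))  # no German title, falls back to English
```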
RNA
  • @Gibbs, thanks for the clue, would you be able to elaborate a little bit more? – RNA Jun 26 '20 at 02:27
  • What do you mean by "a new version can be released with a modified title"? Can you add an example and show what your query currently outputs? – Gibbs Jun 26 '20 at 02:46
  • The title can differ per version, e.g. "my calendar" for v1 and "my awesome calendar" for v2. – RNA Jun 26 '20 at 04:05
  • My intention was to get a catalog containing the latest `version` of each `app`, sorted by `app_title`, but currently it does not work. Let's say there are `v1`~`v5` for an app. If each version has a different `app_title`, all of `v1`, `v2`, `v3`, `v4`, `v5` show up in the catalog. I just want the latest item, `v5`. – RNA Jun 26 '20 at 04:11
  • The title is text, correct? I couldn't find where your query sorts on it, although you mention it in the description. Do you mean that the latest version you need is determined by the number after the v? – Gibbs Jun 26 '20 at 04:14
  • Yes, `app_title` is text. In my query example, `sort_by_title` is at about line 10, under the `composite` aggregation. The app version is a semantic version like 1.2.3, but I simplified it to v1, v2 and so on. You can safely assume v1 ~ v5 as if we were sorting on the major version only. – RNA Jun 26 '20 at 05:24

0 Answers