I have an index with two fields:

  • name: uuid
  • version: long

I want to count only the documents whose version is the highest for their name; the index is very large (1 million+ entries). For example, a query on an index with the following documents:

{name="a", version=1} 
{name="a", version=2}
{name="a", version=3}
{name="b", version=1}

... would return:

count=2

Is this somehow possible? I cannot find a solution for this particular problem.

Julian Pieles

1 Answer

You are effectively describing a count of distinct names, which you can do with a cardinality aggregation.
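To see why the two counts coincide, here is a minimal Python sketch run against the four sample documents from the question. It is only an illustration, not Elasticsearch code: the helper names `count_highest_versions` and `count_distinct_names` are made up for this example, and it assumes each (name, version) pair occurs at most once, so every name has exactly one document carrying its highest version.

```python
# Sample documents from the question (in the real index, name is a UUID).
docs = [
    {"name": "a", "version": 1},
    {"name": "a", "version": 2},
    {"name": "a", "version": 3},
    {"name": "b", "version": 1},
]


def count_highest_versions(docs):
    """Count documents whose version is the highest for their name."""
    highest = {}
    for doc in docs:
        name, version = doc["name"], doc["version"]
        # Remember the largest version seen so far for each name.
        if name not in highest or version > highest[name]:
            highest[name] = version
    # Assumption: (name, version) pairs are unique, so exactly one
    # document per name matches its highest version.
    return sum(1 for doc in docs if doc["version"] == highest[doc["name"]])


def count_distinct_names(docs):
    """Count distinct names, which is what a cardinality aggregation estimates."""
    return len({doc["name"] for doc in docs})


print(count_highest_versions(docs))  # 2
print(count_distinct_names(docs))    # 2
```

If the same name could carry its highest version on more than one document, the two counts would diverge and the shortcut would no longer apply. Also keep in mind that the cardinality aggregation returns an approximate count, so on a very large index the result may differ slightly from the exact number of distinct names.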

Request:

GET test1/_search
{
    "aggs" : {
        "distinct_count" : {
            "cardinality" : {
                "field" : "name.keyword"
            }
        }
    },
    "size": 0
}

Response:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "distinct_count" : {
      "value" : 2
    }
  }
}
Adam T
  • Thank you for your answer but this does not deliver the count of the highest version for a name. – Julian Pieles Dec 03 '19 at 10:42
  • Yes it does. "value": 2 at the bottom of the object. – Adam T Dec 03 '19 at 11:53
  • Ah I see what you mean! It does not really matter what the highest version is. My question is not correct. I will mark this as the correct answer and open a new one. Thank you for your help! – Julian Pieles Dec 03 '19 at 13:22