ES Version: 1.5 (Amazon Elasticsearch)

My goal: search results deduplicated on a certain field. I am currently experimenting with an aggregation to handle the deduplication, so my result is a list of buckets, each holding a single top hit. However, I can't find a way to order the list of buckets.

Current query:

curl -XGET "http://localhost:9200/myidx/product/_search?search_type=count" -d '{
   "size": 2, 
   "query": {
      "function_score": {
         "field_value_factor": {
           "field": "relevance",
           "factor": 2.0
         },
         "query":  { "term": { "title": "abcd" } },
         "score_mode": "multiply",
         "boost_mode": "multiply"
      }
   },
   "aggs": {
      "unique": {
         "terms": {
           "field": "groupid",
           "size": 2
         },
         "aggs": {
           "sample": {
             "top_hits": {
               "size": 1
             }
           }
         }
      }
   }
}'

Result:

{ ...
"aggregations": {
    "unique": {
      "doc_count_error_upper_bound": 1,
      "sum_other_doc_count": 39,
      "buckets": [
        {
          "key": 717878424,
          "doc_count": 14,
          "sample": {
            "hits": {
              "total": 14,
              "max_score": 45.856163,
              "hits": [
                {
                  "_index": "myidx",
                  "_type": "product",
                  "_id": "89531",
                  "_score": 45.856163,
                  "_source": { ... }
                }
              ]
            }
          }
        },
        {
          "key": 717878423,
          "doc_count": 8,
          "sample": {
            "hits": {
              "total": 8,
              "max_score": 68.78424,
              "hits": [
                {
                  "_index": "myidx",
                  "_type": "product",
                  "_id": "89517",
                  "_score": 68.78424,
                  "_source": { ... }
                }
              ]
            }
          }
        }
      ]
    }
  }
}

I would like the second bucket (max_score = 68.78424) to come first, i.e. the buckets ordered by their best score. Is this possible?

If aggregations are not the recommended solution for this, please say so.

tokosh

1 Answer

Yes, you can do it by adding another sub-aggregation on the max document score and sorting the unique terms aggregation by that score.

curl -XGET "http://localhost:9200/myidx/product/_search?search_type=count" -d '{
   "size": 2, 
   "query": {
      "function_score": {
         "field_value_factor": {
           "field": "relevance",
           "factor": 2.0
         },
         "query":  { "term": { "title": "abcd" } },
         "score_mode": "multiply",
         "boost_mode": "multiply"
      }
   },
   "aggs": {
      "unique": {
         "terms": {
           "field": "groupid",
           "size": 2,
           "order": {
              "max_score": "desc"
           }
         },
         "aggs": {
           "max_score": {
             "max": {
               "script": "doc.score"
             }
           },
           "sample": {
             "top_hits": {
               "size": 1
             }
           }
         }
      }
   }
}'
Val
  • I should have mentioned this more explicitly: I use ES on AWS, and Amazon does not permit "script". Do you know of a workaround that avoids "script"? The answer still helps me understand the aggregation, so thanks. – tokosh Dec 10 '15 at 07:53
  • By "ES on AWS", you mean that you're using the new [Amazon Elasticsearch](https://aws.amazon.com/elasticsearch-service/) service and not a custom install on AWS EC2, right? – Val Dec 10 '15 at 08:00
  • I'm afraid I don't see any solution. You'd have to do it in your client-side application. – Val Dec 11 '15 at 04:13
  • I accept this as an answer to this question. So the solution I'm probably going for is one where I index the documents with an extra field "group-head" and then filter on it to deal with the deduplication and the order should be as usual. – tokosh Dec 11 '15 at 05:43
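
For readers who cannot use "script" either, the client-side ordering suggested in the comments could look roughly like the sketch below. It assumes the response has already been parsed into a Python dict with the same shape as the result in the question; the helper names `best_score` and `sort_buckets_by_score` are made up for illustration and are not part of any Elasticsearch client.

```python
from typing import Any, Dict, List


def best_score(bucket: Dict[str, Any], top_hits_name: str = "sample") -> float:
    """Return the max_score of a bucket's top_hits sub-aggregation.

    max_score can be null (None) when the top_hits aggregation uses a
    custom sort, so such buckets are pushed to the end.
    """
    score = bucket[top_hits_name]["hits"]["max_score"]
    return float("-inf") if score is None else score


def sort_buckets_by_score(
    response: Dict[str, Any],
    agg_name: str = "unique",
    top_hits_name: str = "sample",
) -> List[Dict[str, Any]]:
    """Return the terms buckets ordered by their best hit score, highest first."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return sorted(
        buckets,
        key=lambda bucket: best_score(bucket, top_hits_name),
        reverse=True,
    )


# Trimmed-down copy of the response shown in the question.
response = {
    "aggregations": {
        "unique": {
            "buckets": [
                {
                    "key": 717878424,
                    "doc_count": 14,
                    "sample": {"hits": {"total": 14, "max_score": 45.856163, "hits": []}},
                },
                {
                    "key": 717878423,
                    "doc_count": 8,
                    "sample": {"hits": {"total": 8, "max_score": 68.78424, "hits": []}},
                },
            ]
        }
    }
}

ordered = sort_buckets_by_score(response)
print([bucket["key"] for bucket in ordered])  # [717878423, 717878424]
```

One caveat with this approach: without an explicit "order", the terms aggregation returns only the `size` buckets with the highest doc_count, so a small group with a very high score may not be in the response at all. The terms "size" would have to be large enough to cover every group you want to rank before sorting on the client.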