
I was trying to build a simple POC for related items using Elasticsearch's more_like_this query (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-mlt-query.html#query-dsl-mlt-query).

However, I could not work out how to use boosting so that the important fields in my documents carry more weight in the final score. Also, how can I apply a boost in a more_like_this query so that recent documents carry more weight?

Thanks.

Global Warrior

1 Answer


One way to boost a match on a particular field (or similarity to a particular document) is to use several more_like_this queries, one per field or document, each with its own boost, and wrap them in either the should clause of a bool query or a dis_max query, depending on whether you want "sum of" or "max of" logic when scoring.

An example using dis_max:

POST  test_index/_search?explain=true
{
    "fields": [
       "field1",
       "field2"
    ], 
    "query": { 
        "dis_max": {
           "queries": [
               {
                "more_like_this" : {
                    "fields" : ["field1"],
                    "like_text" : "this is  some text",
                    "min_term_freq" : 1,
                    "max_query_terms" : 2,
                    "boost": 20
                }
               },
               {
                "more_like_this" : {
                    "fields" : ["field2"],
                    "like_text" : "this is some other text",
                    "min_term_freq" : 1,
                    "max_query_terms" : 2,
                    "boost": 20
                }
               }
            ]
        }
    }
}
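
For comparison, the "sum of" variant mentioned above can be sketched by moving the same two clauses into the `should` clause of a `bool` query. This is only a sketch: the index name, field names, and like_text values are carried over from the example above, and the boost values 20 and 5 are arbitrary placeholders chosen to show field1 weighted more heavily than field2. A document matching both clauses is scored on the combined clause scores rather than only the best one.

```json
POST test_index/_search
{
    "query": {
        "bool": {
            "should": [
                {
                    "more_like_this": {
                        "fields": ["field1"],
                        "like_text": "this is some text",
                        "min_term_freq": 1,
                        "max_query_terms": 2,
                        "boost": 20
                    }
                },
                {
                    "more_like_this": {
                        "fields": ["field2"],
                        "like_text": "this is some other text",
                        "min_term_freq": 1,
                        "max_query_terms": 2,
                        "boost": 5
                    }
                }
            ]
        }
    }
}
```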
acoelhosantos
keety
  • The solution looks great. Let me try this out and reply back. Thanks for the answer. – Global Warrior Nov 12 '14 at 13:06
  • Use `tie_breaker` so that the scores of all fields are included in the final score. – Global Warrior Dec 08 '14 at 09:11
  • @keety Can you describe your solution in more detail? I'm really interested, but even after reading the documentation I have doubts about the meaning of "based on whether you want "max of" or "sum of" logic while scoring". Can you explain this concept? – andPat Feb 17 '16 at 10:35
  • In the above example I have used `dis_max`, which scores each document by taking the `max` of the scores of the two `more_like_this` clauses. If you want the document to be assigned a score that is the sum of the individual `more_like_this` clause scores, you would need to wrap them in a `bool` query. See this [thread](http://stackoverflow.com/questions/35440312/how-do-i-include-a-sum-clause-for-two-more-like-this-queries/35442494#comment58587023_35442494) for an example. – keety Feb 17 '16 at 15:47
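
The `tie_breaker` suggestion from the comments can be sketched as follows. With `tie_breaker` set, `dis_max` scores a document as the best clause score plus `tie_breaker` times the scores of the other matching clauses, so the lower-scoring field still contributes. The value 0.3 and the boost values are arbitrary placeholders, not tuned recommendations; the index, fields, and like_text values are the ones from the answer.

```json
POST test_index/_search
{
    "query": {
        "dis_max": {
            "tie_breaker": 0.3,
            "queries": [
                {
                    "more_like_this": {
                        "fields": ["field1"],
                        "like_text": "this is some text",
                        "min_term_freq": 1,
                        "max_query_terms": 2,
                        "boost": 20
                    }
                },
                {
                    "more_like_this": {
                        "fields": ["field2"],
                        "like_text": "this is some other text",
                        "min_term_freq": 1,
                        "max_query_terms": 2,
                        "boost": 5
                    }
                }
            ]
        }
    }
}
```

A `tie_breaker` of 0 gives pure "max of" scoring, while 1 makes every matching clause count in full, which behaves like a plain sum.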