Searching for names (text) that contain spaces is causing me problems. My mapping looks like this:

{"user":{"properties":{"name":{"type":"string"}}}}

Ideally, the search should return and rank results as follows:

1) Names that exactly match the search term come first (highest score)
2) Names that start with the search term (high score)
3) Names that contain the exact search term as a substring (medium score)
4) Names that contain any of the search term's tokens (lowest score)

Example: for the following names in Elasticsearch

Maaz Tariq
Ahmed Maaz Tariq
Maaz Sheeba
Maaz Bin Tariq
Sana Tariq
Maaz Tariq Ahmed

Searching for "Maaz Tariq", the results should be in the following order:

Maaz Tariq (highest score)
Maaz Tariq Ahmed (high score)
Ahmed Maaz Tariq (medium score)
Maaz Bin Tariq  (lowest score)
Maaz Sheeba (lowest score)
Sana Tariq (lowest score)
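
To make the four tiers concrete, here is a minimal plain-Java sketch of the intended ordering. It does not use Elasticsearch at all; the class and method names (`NameRankSketch`, `tier`, `rank`) are made up for illustration, and the case-insensitive comparison is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the desired ranking, independent of Elasticsearch.
class NameRankSketch {

    // Returns 4 (exact), 3 (prefix), 2 (substring), 1 (shared token) or 0 (no match).
    static int tier(String name, String term) {
        // Assumption: matching is case-insensitive.
        String n = name.trim().toLowerCase(Locale.ROOT);
        String t = term.trim().toLowerCase(Locale.ROOT);
        if (n.equals(t)) return 4;
        if (n.startsWith(t)) return 3;
        if (n.contains(t)) return 2;
        Set<String> nameTokens = new HashSet<>(Arrays.asList(n.split("\\s+")));
        for (String token : t.split("\\s+")) {
            if (nameTokens.contains(token)) return 1;
        }
        return 0;
    }

    // Sorts a copy of the names by tier, best first; ties keep their input order
    // because List.sort is stable.
    static List<String> rank(List<String> names, String term) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingInt((String n) -> tier(n, term)).reversed());
        return sorted;
    }
}
```

Calling `NameRankSketch.rank(names, "Maaz Tariq")` on the six names above puts "Maaz Tariq", "Maaz Tariq Ahmed" and "Ahmed Maaz Tariq" first, in that order; the three lowest-tier names follow in whatever order they were supplied.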

Can anyone point me to which analyzers to use, and how to rank the search results for names?

maaz

3 Answers

You can use the multi field type, a bool query and the custom boost factor query to solve this problem.

Mapping:

{
    "mappings" : {
        "user" : {        
            "properties" : {
                "name": {
                    "type": "multi_field",
                    "fields": {
                        "name": { "type" : "string", "index": "analyzed" },
                        "exact": { "type" : "string", "index": "not_analyzed" }
                    }
                }
            }
        }
    }
}

Query:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "name": "Maaz Tariq"
                    }
                }
            ],
            "should": [
                {
                    "custom_boost_factor": {
                        "query": {
                            "term": {
                                "name.exact": "Maaz Tariq"
                            }
                        },
                        "boost_factor": 15
                    }
                },
                {
                    "custom_boost_factor": {
                        "query": {
                            "prefix": {
                                "name.exact": "Maaz Tariq"
                            }
                        },
                        "boost_factor": 10
                    }
                },
                {
                    "custom_boost_factor": {
                        "query": {
                            "match_phrase": {
                                "name": {
                                    "query": "Maaz Tariq",
                                    "slop": 0
                                }
                            }
                        },
                        "boost_factor": 5
                    }
                }
            ]
        }
    }
}

edit:

As pointed out by javanna, the custom_boost_factor isn't needed.

Query without custom_boost_factor:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "name": "Maaz Tariq"
                    }
                }
            ],
            "should": [
                {
                    "term": {
                        "name.exact": {
                            "value": "Maaz Tariq",
                            "boost": 15
                        }
                    }
                },
                {
                    "prefix": {
                        "name.exact": {
                            "value": "Maaz Tariq",
                            "boost": 10
                        }
                    }
                },
                {
                    "match_phrase": {
                        "name": {
                            "query": "Maaz Tariq",
                            "slop": 0,
                            "boost": 5
                        }
                    }
                }
            ]
        }
    }
}
Ivaldi
  • I would prefer a filter-based solution, but I couldn't find the right filter for the third requirement. – Ivaldi May 23 '13 at 17:21
  • You can just make a phrase query for that. Also, I don't get why you need a custom_boost_factor query. Can't you just give a different weight to your different queries using the `boost` option? – javanna May 23 '13 at 18:37
  • `Boost` isn't allowed in a `should` sub query!? (At least I don't know the syntax for this.) And how does a phrase query filter work without the `span_near` query and without the `match_phrase` query? – Ivaldi May 23 '13 at 19:20
  • As far as I remember every query supports the boost. Actually, the phrase query is what you are doing with match_phrase, didn't realize that! – javanna May 23 '13 at 20:31
  • I tried it. If I add a `boost` to the sub query it becomes invalid. – Ivaldi May 23 '13 at 20:51
  • You might be doing something wrong then I'm afraid. Have a look at this gist: https://gist.github.com/imotov/b62f45dcf28a5c030e67 . – javanna May 23 '13 at 21:05
  • You are right, thanks. My mistake was that I used `query` instead of `value` in the term query. – Ivaldi May 23 '13 at 22:19
  • @Ivaldi: great stuff, bro. It helped :) – Mithun Satheesh Nov 22 '13 at 05:03

With the Java API, when querying exact strings that contain spaces, use:

CLIENT.prepareSearch(index)
    .setQuery(QueryBuilders.queryStringQuery(wordString)
        .field(fieldName));

With many other query types, you get no results.
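
One thing to keep in mind: a query string is parsed with Lucene query syntax, where double quotes mark a phrase, so a term with spaces may need to be quoted before it is treated as a single phrase. The helper below is a hedged sketch of that quoting step; `PhraseQuoting` and `quotePhrase` are hypothetical names, not part of the Elasticsearch API, and whether a phrase match gives the wanted results still depends on how the field is analyzed.

```java
// Hypothetical helper, not part of any Elasticsearch library.
class PhraseQuoting {

    // Wraps the term in double quotes so query-string syntax reads it as one phrase.
    // Backslashes and double quotes inside the term are escaped with a backslash.
    static String quotePhrase(String term) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : term.toCharArray()) {
            if (c == '\\' || c == '"') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }
}
```

The result of `PhraseQuoting.quotePhrase(wordString)` could then be passed to `QueryBuilders.queryStringQuery(...)` in place of the raw `wordString`.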

Danielson

And from Elasticsearch 1.0:

"title": {
    "type": "multi_field",
    "fields": {
        "title": { "type": "string" },
        "raw":   { "type": "string", "index": "not_analyzed" }
    }
}

became:

"title": {
    "type": "string",
    "fields": {
        "raw":   { "type": "string", "index": "not_analyzed" }
    }
}

https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html

Yan Burtovoy