I am upgrading my Elasticsearch server from version 1.6.0 to 7.12.1, which made me rewrite every query I had.
Those queries retrieve materials identified by three fields: nature.idCat, nature.idNat and marque.idMrq (category ID, nature ID and brand ID).
My application has a search field for finding specific materials, so if the user enters "photoc", the query sent to my Elasticsearch server looks like this:
{
  "sort": [
    "_score"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "default_field": "search",
            "query": "*photoc*",
            "boost": 10
          }
        },
        [...] // Some more irrelevant conditions for this question like
              // if nature.idCat = 26 then idNat must be in some range and idMrq in some other range
      ]
    }
  }
}
Here are two example "hits" returned by this query:
"hits": [
  {
    "_index": "ref_biens",
    "_type": "_doc",
    "_id": "T3RrpXsBz_TibRxz0akC",
    "_score": 13.0,
    "_source": {
      "search": "Photocopieur GENERIQUE",
      "nature": {
        "idCat": 26,
        "idNat": 665,
        "libelle": "Photocopieur",
        "ekip": "U03C",
        "codeINSEE": 300121,
        "noteMaterielArrondi": 5
      },
      "marque": {
        "idMrq": 16,
        "libelle": "GENERIQUE",
        "ekip": "Z999",
        "idVRDuree": 808
      }
    }
  },
  {
    "_index": "ref_biens",
    "_type": "_doc",
    "_id": "UHRrpXsBz_TibRxz0akC",
    "_score": 13.0,
    "_source": {
      "search": "Photocopieur INFOTEC",
      "nature": {
        "idCat": 26,
        "idNat": 665,
        "libelle": "Photocopieur",
        "ekip": "U03C",
        "codeINSEE": 300121,
        "noteMaterielArrondi": 5
      },
      "marque": {
        "idMrq": 1244,
        "libelle": "INFOTEC",
        "ekip": "I091",
        "idVRDuree": 808
      }
    }
  }
]
This works perfectly!
My problem appears when the user types more than one word. For example, if they search specifically for "Photocopieur PANASONIC", the right material comes back as the first result with a _score of 23, but every other match has the same _score of 13. As a result, a totally different material (one matching only on the brand name, for example) can come next, even though I want the other "Photocopieur" entries to be displayed first.
My idea is to add "score points" to the results that are most similar to the best match: for instance, a 6-point boost for the same nature.idCat, 4 points for the same nature.idNat and 2 points for the same marque.idMrq.
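To make the idea concrete, here is a rough, untested sketch of what I have in mind, written in Python only to build the request body. It assumes a function_score query with one filter/weight function per ID, that nature and marque are plain object fields (not nested) so a term filter can reach them, and that the best match's IDs are already known from a first query. The helper name build_similarity_query and its parameters are hypothetical, and I have not checked the exact score arithmetic against my 7.12.1 server.

```python
import json


def build_similarity_query(query_expr, best_match):
    """Hypothetical helper: wrap the text search in a function_score query
    that adds fixed points for every ID shared with the best match."""
    # Weights are the 6 / 4 / 2 points proposed above.
    # Assumption: best_match is the _source of the top hit of a first query.
    boosts = [
        ("nature.idCat", best_match["nature"]["idCat"], 6),
        ("nature.idNat", best_match["nature"]["idNat"], 4),
        ("marque.idMrq", best_match["marque"]["idMrq"], 2),
    ]
    return {
        "sort": ["_score"],
        "query": {
            "function_score": {
                # Same text search as before; the other conditions elided
                # in the question would go into this "must" list as well.
                "query": {
                    "bool": {
                        "must": [
                            {
                                "query_string": {
                                    "default_field": "search",
                                    "query": query_expr,
                                    "boost": 10,
                                }
                            }
                        ]
                    }
                },
                # One function per ID: the weight only applies to documents
                # that pass the term filter.
                "functions": [
                    {"filter": {"term": {field: value}}, "weight": weight}
                    for field, value, weight in boosts
                ],
                "score_mode": "sum",  # add up the weights of matching functions
                "boost_mode": "sum",  # then add that total to the text score
            }
        },
    }


if __name__ == "__main__":
    # IDs taken from the first example hit above.
    best = {"nature": {"idCat": 26, "idNat": 665}, "marque": {"idMrq": 16}}
    print(json.dumps(build_similarity_query("*photoc*", best), indent=2))
```

The drawback I can see is that this needs two round trips: one query to find the best match, then a second one boosted with its IDs.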
Any idea how I can achieve that? Is this the correct approach to my problem?