I'm using Elasticsearch 2.3. I've stored all the mobile products attribute-wise in ES after removing stopwords (e.g. "with", "extra").
Sample document for "Micromax Canvas Doodle 4 white with 8 GB ram and 8gb internal memory":
"_source": {
  "internal_mem": "8 GB",
  "color": "White",
  "brand": "Micromax",
  "ram": "8 GB",
  "model": "Canvas Doodle 4"
}
ES holds thousands of mobile products with these attributes, and I need to search over them. Before searching, I break the search text down into the same attributes. So a search for "canvas doodle 4 gb" becomes:
{
  "query": {
    "bool": {
      "should": [{
        "match": {
          "model": {
            "query": "canvas^4 doodle",
            "boost": 2
          }
        }
      }, {
        "match": {
          "internal_mem": {
            "query": "4 GB",
            "boost": 0.2
          }
        }
      }]
    }
  }
}
Results I want, in this order:
- All products of "canvas doodle 4g" or "canvas doodle" first (sorted by score)
- Then, products having "canvas"
- Then, products having "4g"
Rules I've made:
- Model and brand should have higher priority than the other three attributes
- The first word in model/brand should carry more weight, e.g. "iPhone", "Canvas"
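A minimal sketch of how these two rules could be turned into a query body, written in Python only to build the JSON. It emits one `match` clause per word for model/brand so that the first word can carry a larger boost; as far as I know the `^4` suffix is only interpreted by `query_string`, not by `match`. The helper name `build_query`, the `FIRST_WORD_FACTOR` multiplier and every boost other than the 2 and 0.2 taken from the query above are assumptions for illustration, not recommended values.

```python
import json

# Placeholder boosts: 2 and 0.2 come from the query in the question,
# the remaining values are assumptions for illustration only.
FIELD_BOOSTS = {
    "model": 2.0,
    "brand": 2.0,
    "internal_mem": 0.2,
    "ram": 0.2,
    "color": 0.2,
}
FIRST_WORD_FACTOR = 4.0  # assumed multiplier for the first word of model/brand
PRIORITY_FIELDS = {"model", "brand"}


def build_query(attributes):
    """Build a bool/should query body from {field: text} pairs."""
    should = []
    for field, text in attributes.items():
        boost = FIELD_BOOSTS.get(field, 1.0)
        if field in PRIORITY_FIELDS:
            # One match clause per word, so the first word gets a larger boost.
            for i, word in enumerate(text.split()):
                word_boost = boost * FIRST_WORD_FACTOR if i == 0 else boost
                should.append(
                    {"match": {field: {"query": word, "boost": word_boost}}}
                )
        else:
            # Lower-priority attributes stay as a single match clause.
            should.append({"match": {field: {"query": text, "boost": boost}}})
    return {"query": {"bool": {"should": should}}}


if __name__ == "__main__":
    body = build_query({"model": "canvas doodle", "internal_mem": "4 GB"})
    print(json.dumps(body, indent=2))
```

Note that this boosts the first word of the search text, not the first word stored in the field; weighting by position inside the indexed value would need a different approach.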
Issues:
- Should I use this query, or should I go for a function_score query (I need a custom score as well)?
- How do I avoid matching short, common tokens such as "4", "mini", "3g" or "4g" in model? Should I disable IDF so that such results are avoided?
- How do I give priority to the first word of model/brand (assuming it is the most important one, e.g. "canvas" in "canvas doodle 3")?
- What are the recommended values of "boost" for the different attributes?
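On the first question, a rough sketch of what the function_score variant might look like, again built in Python just to produce the JSON body. It wraps a bool query in `function_score` and multiplies the text score by a `field_value_factor`. The field `popularity` is hypothetical (nothing like it exists in the sample document), and the helper name `with_function_score` and the `log1p` modifier are assumptions chosen only to show the shape of the request.

```python
import json


def with_function_score(base_query, field="popularity"):
    """Wrap a query so that a numeric document field scales the text score.

    `field` defaults to a hypothetical `popularity` attribute that is
    not part of the sample document in the question.
    """
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    {
                        "field_value_factor": {
                            "field": field,
                            "modifier": "log1p",
                            "factor": 1,
                        }
                    }
                ],
                # Combine the function result with the text score by multiplying.
                "boost_mode": "multiply",
            }
        }
    }


if __name__ == "__main__":
    base = {
        "bool": {
            "should": [
                {"match": {"model": {"query": "canvas doodle", "boost": 2}}},
                {"match": {"internal_mem": {"query": "4 GB", "boost": 0.2}}},
            ]
        }
    }
    print(json.dumps(with_function_score(base), indent=2))
```

The bool query still decides which documents match and in what rough order; the function only rescales those scores, so the per-attribute boosts remain relevant either way.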
I'm open to any suggestions or improvements.