Some of my searches return over 10k documents in total, with scores ranging from high (~75 in my most recent search) to very low (less than 5). Other queries return a high score of ~20 and a low score of ~1.

Does anyone have a good solution for trimming off the less relevant documents? A Java or query-level implementation would work. I've thought about using min_score, but I'm wary of it since it has to be a constant number, and for some of my responses the scores are a lot closer together than in the examples above. I suppose I could come up with a formula based on the returned scores to create a cutoff for every response, but I was curious whether anyone has come up with a solution to a similar use case.

JR3652
  • You can use the `size` parameter to limit the total number of documents returned by the query. By default Elasticsearch sorts by score in descending order. – Nishant Jan 29 '19 at 02:54
  • Are the documents in the tail still relevant for you? If they are not, you should investigate why your query returns irrelevant content (in your judgment); that is the best way to trim the result list, favoring precision over recall. If the 10k docs are relevant, you don't need to trim the result list; just check that your users find what they want in the first pages =) And if you still want to trim based on score, you will have to run a first query with size 0 to get the best score, then re-query with a min_score clause based on that initial best score. – Pierre Mallet Jan 29 '19 at 09:04
  • @PierreMallet that's what I ended up doing. I ran a query with size zero, took the max score from that, and now I'm using a percentage of that max score as the min_score for the actual query. – JR3652 Jan 29 '19 at 14:46
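
The two-pass approach from the comments could be sketched roughly as below. This is a minimal illustration, not the poster's actual code: the class `RelativeScoreCutoff` and its methods are hypothetical names, the 25% fraction is an arbitrary choice, and the step that runs the probe query and reads `max_score` from the response is left out because it depends on the client in use. Only `size`, `min_score`, and `query` are real search body fields.

```java
import java.util.Locale;

// Hypothetical helper, not part of any Elasticsearch client library.
class RelativeScoreCutoff {

    /**
     * Derives a per-query min_score as a fraction of the best score
     * observed in a first "probe" query.
     */
    static double minScoreFor(double maxScore, double fraction) {
        if (Double.isNaN(fraction) || fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        if (Double.isNaN(maxScore) || maxScore <= 0.0) {
            return 0.0; // no usable score from the probe: do not filter
        }
        return maxScore * fraction;
    }

    /**
     * Wraps an already-serialized query clause in a search body that
     * carries the computed min_score. The caller supplies valid JSON.
     */
    static String searchBody(String queryJson, double minScore, int size) {
        return String.format(Locale.ROOT,
                "{\"size\": %d, \"min_score\": %s, \"query\": %s}",
                size, Double.toString(minScore), queryJson);
    }

    public static void main(String[] args) {
        // Assumed values: the probe query reported a best score of 20.0,
        // and we keep only hits scoring at least 25% of that.
        double cutoff = minScoreFor(20.0, 0.25);
        System.out.println(searchBody("{\"match_all\": {}}", cutoff, 10));
    }
}
```

Because the cutoff scales with each query's own best score, a query topping out at ~75 and one topping out at ~20 get different thresholds, which addresses the concern about `min_score` being a constant. The fraction itself still has to be tuned against real result lists.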

0 Answers