2

I need to retrieve documents in Elasticsearch ranked not by the default scoring function (TF-IDF etc.) but only by word frequency or term frequency (no IDF etc.). Is there a way to modify the scoring? Can I do it in Python?

ayush singhal
  • Is there some way to use only the coordination factor (coord) for scoring and switch off the others? I think I only need coord for my type of search. TF-IDF is penalizing the terms because they appear in all documents. – ayush singhal Apr 12 '17 at 16:11
  • What ES version is this? And which word/term do you want to get the frequency for? – Andrei Stefan Apr 12 '17 at 20:47

1 Answer

0

You can use a constant_score query when you don't care about TF/IDF; every matching document then gets the same score:

{
    "query": {
        "constant_score": {
            "filter": {
                "match": {
                    "description": "any word"
                }
            }
        }
    }
}
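
Since the question also asks about Python, here is a minimal sketch of building a constant_score request body as a plain dict. It puts constant_score directly under query and wraps the match in the filter key, which is the form Elasticsearch 2.x and later accept; the ES version is an assumption, as the question does not state it. The helper name constant_score_query and the index name in the trailing comment are hypothetical.

```python
import json


def constant_score_query(field, text, boost=1.0):
    """Build a search body in which every matching document gets the same score."""
    return {
        "query": {
            "constant_score": {
                # Assumption: Elasticsearch 2.x or later, where the wrapped
                # query goes under "filter".
                "filter": {"match": {field: text}},
                "boost": boost,
            }
        }
    }


body = constant_score_query("description", "any word")
print(json.dumps(body, indent=4))

# Sending it is left as a comment because it needs a running cluster.
# Assumes the official `elasticsearch` client and a hypothetical index "my_index":
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch()
#   hits = es.search(index="my_index", body=body)["hits"]["hits"]
```

Every hit comes back with a score equal to boost, so this removes TF/IDF from the ranking but does not by itself order documents by term frequency.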
RoiHatam
  • I want the documents that match the most terms in the query to rank highest. If the query has the same word 3 times, such as "tin tin tin", then documents with 3 "tin"s should rank higher than those with 2 or 1 "tin", or even 4, 5 or more "tin"s. You can assume that all docs in ES contain the word "tin". – ayush singhal Apr 12 '17 at 15:26
  • I'm sorry, the tf/idf will do that but you want another algorithm. – RoiHatam Apr 12 '17 at 15:54
  • Is there some way to use only the coordination factor (coord) for scoring and switch off the others? I think I only need coord for my type of search. TF-IDF is penalizing the terms because they appear in all documents. – ayush singhal Apr 12 '17 at 16:10
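
One way to approximate coord-only ranking, offered as a sketch and not as something proposed in this thread: give each distinct query word its own constant_score clause inside a bool/should, so a document's score grows with the number of query words it matches and ignores TF and IDF. The helper name matched_terms_query is hypothetical, and the whitespace split is only a stand-in for real analysis. Note the limits: a repeated word such as "tin tin tin" collapses to a single clause, so this does not cover the "exactly 3 tins" case, and on ES versions that still apply a coord factor the raw scores are scaled, though the ordering by number of matched words should stay the same.

```python
def matched_terms_query(field, text):
    """Build a bool/should body whose score counts the distinct query words matched."""
    # Crude whitespace split, used only to separate the query into words;
    # Elasticsearch still analyzes each word inside its own match clause.
    words = list(dict.fromkeys(text.split()))  # de-duplicate, keep order
    should = [
        # Each matching clause contributes a fixed 1.0, regardless of TF or IDF.
        {"constant_score": {"filter": {"match": {field: word}}, "boost": 1.0}}
        for word in words
    ]
    return {"query": {"bool": {"should": should}}}


body = matched_terms_query("description", "red tin box tin")
for clause in body["query"]["bool"]["should"]:
    print(clause)
```

The resulting dict can be passed as the request body of a search in the same way as any other query.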