In my particular use case, the IDF factor that gets calculated as part of the TF-IDF algorithm messes up the scoring for my queries. Basically, I want the queries to take only the term frequency into account. Is it possible to disable the IDF factor, i.e. set it to 1, for a particular index? I have looked into the similarity module (in version 0.90.X) but haven't found anything that could help; the same goes for the function_score query. Do I need to write a custom Similarity class in Java? Or is there a plugin for what I'm trying to achieve?

GlurG
  • I believe it's connected with my question http://stackoverflow.com/questions/22016735/elasticsearch-similary-for-countries. I tried to use DFR, but with no success. – Alex Feb 25 '14 at 16:39

1 Answer

What about the constant_score query?

See http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/ignoring-tfidf.html

Don't hesitate to use ?explain=true to see how each score is computed.
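As a rough sketch of the explain tip above: besides the ?explain=true URL parameter, the same flag can be set in the search request body. The snippet below only builds that body as a Python dict; the field name "title" and the search text are made up for illustration, and no request is actually sent.

```python
import json

def search_body_with_explain(query):
    """Return a search request body that asks for a score explanation per hit."""
    return {"explain": True, "query": query}

# "title" and "quick fox" are hypothetical; substitute your own field and text.
body = search_body_with_explain({"match": {"title": "quick fox"}})
print(json.dumps(body, indent=2))
```

Each hit in the response should then carry an explanation tree in which the tf and idf contributions show up as separate entries.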

As you can see here, without constant_score:

With IDF

And with a constant_score query (which wraps your real query):

Without IDF
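
For reference, here is a minimal sketch of what "wrapping your real query" could look like, written as a Python helper that builds the request body. The field name "title" is made up, and the inner clause key is an assumption that depends on your Elasticsearch version: recent releases expect "filter", while older ones (such as the version covered by the guide linked above) also accepted "query".

```python
import json

def constant_score(inner_query, boost=1.0, clause="filter"):
    """Wrap inner_query so that every matching document gets the same score (boost).

    clause is an assumption: use "filter" on recent Elasticsearch versions,
    or "query" on older ones that still accept it.
    """
    return {"constant_score": {clause: inner_query, "boost": boost}}

# "title" and "quick fox" are hypothetical; substitute your own field and text.
body = {"query": constant_score({"match": {"title": "quick fox"}})}
print(json.dumps(body, indent=2))
```

Keep in mind that a constant score ignores term frequency as well as IDF, so this only fits if a flat score per matching clause is acceptable.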

Thomas Decaux
  • Since constant_score turns off both TF and IDF, I'm pretty sure the result will be the same as when using a filter. @GlurG seems to want to turn off only IDF while keeping TF. Do you have any idea? – humbroll Apr 14 '16 at 00:13
  • Do you mean changing the scoring/ranking formula? This page should help: https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html – Thomas Decaux Apr 14 '16 at 17:59