2

I have a list of documents, each with a relevance score for a search query. I need older documents to have their relevance score dampened, so that their date plays a part in the ranking. I already tried fiddling with functions such as 1/(1+date_difference), but the reciprocal function discriminates too strongly between recent dates that are close together.

I was thinking of a mathematical function with range (0..1) and domain (0..x) to scale their score, where the x-axis is the age of a document. What I further need from the function is best explained by an image:

Benjamin Bannier
dscer

4 Answers

1

If a simple 1/(1+x) decreases too quickly too soon, a sigmoid function such as the logistic function 1/(1+e^-x) or the error function, mirrored so that it falls as the age grows, might be better suited to your purpose. Place the current date somewhere in the negative numbers of such a function, and you get a value that stays near its maximum for some configurable time and then decreases towards a base value.
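A minimal Python sketch of that idea, with both a logistic and an error-function variant. The function names, the parameters `midpoint`, `steepness` and `width`, and their default values are assumptions made for illustration, and the age can be in whatever unit you rank by. Shifting by `midpoint` is what puts an age of zero on the flat part of the curve.

```python
import math

def logistic_decay(age, midpoint=30.0, steepness=0.2):
    """Mirrored logistic: close to 1 for fresh documents, 0.5 at `midpoint`,
    approaching 0 for very old ones. Parameter names are illustrative."""
    z = steepness * (age - midpoint)
    # Evaluate 1 / (1 + e^z) without overflowing for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def erfc_decay(age, midpoint=30.0, width=10.0):
    """Same shape built from the complementary error function."""
    return 0.5 * math.erfc((age - midpoint) / width)
```

Multiplying a document's relevance score by either value keeps recent documents nearly untouched and dampens older ones smoothly.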

thiton
1

Decaying behavior is often modeled well by an exponential function (many decaying processes in nature also follow it). With two positive parameters A and B you get

y(x) = A exp(-B x)

Since you want y to stay between 0 and 1, set A=1. A larger B gives a faster decay.
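One way to pick B is through a half-life, using B = ln(2) / half_life, so that the factor is exactly 0.5 once a document is one half-life old. A short Python sketch follows; the function names and the 30-day default are assumptions for illustration, not part of the original formula.

```python
import math

def exp_decay(age_days, half_life_days=30.0):
    """y(x) = exp(-B x) with A = 1 and B derived from a half-life."""
    B = math.log(2.0) / half_life_days
    return math.exp(-B * age_days)

def dampened_score(relevance, age_days, half_life_days=30.0):
    """Scale a relevance score by the age-based decay factor."""
    return relevance * exp_decay(age_days, half_life_days)
```

For example, with a 30-day half-life a document that is 60 days old keeps a quarter of its original relevance score.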

Benjamin Bannier
1
log((x+1)-age_of_document)

Here the base of the logarithm is (x+1), and x is the "threshold" from your diagram. If the age of the document is greater than x, the score goes negative. Multiply by the maximum possible score to introduce scaling.

E.g. for a domain of (0,10) and a maximum score of 10: 10*log(11-age_of_document)/log(11)
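A hedged Python sketch of this formula. The names `log_decay` and `log_decay_clamped` are made up for illustration, and clamping to zero past the threshold is an added assumption rather than part of the answer. Note that the logarithm is undefined once the age reaches threshold + 1, so the sketch raises an error there.

```python
import math

def log_decay(age, threshold, max_score=1.0):
    """max_score * log_base(base - age) with base = threshold + 1."""
    base = threshold + 1.0
    if age >= base:
        raise ValueError("logarithm is undefined once age reaches threshold + 1")
    return max_score * math.log(base - age) / math.log(base)

def log_decay_clamped(age, threshold, max_score=1.0):
    """Assumption: treat every document older than the threshold as score 0."""
    if age >= threshold:
        return 0.0
    return log_decay(age, threshold, max_score)
```

With a threshold of 10 and a maximum score of 10, an age of 0 gives the full score and an age of 10 gives 0.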

troelskn
satish b
0

A bit late, but as thiton says, you might want to use a sigmoid function instead, since it has a "floor" value for your long tail data points. E.g.:

0.8/(1+5^(x-3)) + 0.2 - You can adjust the constants 5 and 3 to control the shape of the curve: 5 sets the steepness and 3 is the age at which the curve is halfway between its top and the floor. The 0.2 is where the floor will be.
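The same formula as a small Python sketch, generalised so that the 0.8 is derived as 1 - floor. The parameter names `base`, `midpoint` and `floor` are assumptions for illustration. The power 5^(x-3) is rewritten through `exp` so that very large ages do not overflow.

```python
import math

def floored_sigmoid(age, base=5.0, midpoint=3.0, floor=0.2):
    """(1 - floor) / (1 + base**(age - midpoint)) + floor."""
    # base**(age - midpoint) == exp(z) with z as below.
    z = (age - midpoint) * math.log(base)
    if z >= 0:
        e = math.exp(-z)
        s = e / (1.0 + e)
    else:
        s = 1.0 / (1.0 + math.exp(z))
    return (1.0 - floor) * s + floor
```

With the default constants the factor starts just below 1 for new documents, passes 0.6 at an age of 3, and settles at the 0.2 floor for the long tail.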

troelskn