0

How can I set up the default_scope in my blogging application so that the index orders the entries by an algorithm defined in the model?

If I were to use a HackerNews-like formula for the ranking algorithm as shown below, how can I define it in my model?

total_score = (votes_gained - 1) / (age_in_hours + 2)^1.5

The votes_gained variable relies on the Active_Record_Reputation_System, and is written as the following in my views:

votes_gained = @post.reputation_value_for(:votes).to_i

Finally, age_in_hours is pretty straight forward

age_in_hours = (Time.now - @post.created_at)/1.hour

How can I use these figures to order my blog posts index? I've been trying to figure out how to define total_score correctly in the model so that I can add it into the default scope as default_scope order("total_score DESC") or something similar. Direct substitution has not worked, and I'm not sure of how to "rephrase" each part of the formula.

How exactly should I define total_score? Thanks much for your insight!

umezo
  • 1,519
  • 1
  • 19
  • 33
  • 1
    The main issue I see here is that, while you need to give SQL the formula so that it can order the records, the formula contains a function call that must be evaluated by ruby, namely `reputation_value_for(:votes)` for each post. SQL can't call ruby to evaluate the function. The formula can only contain terms that SQL can figure out itself, like values stored in columns or simple mathematical expressions involving those values, or even the current time, which you can get by either passing from ruby or using SQL NOW. – cdesrosiers Sep 23 '12 at 17:16
  • Thanks @cdesrosiers. So would a solution be to create a `total_score` column in my Post model? If so, is there an easy way to do this? – umezo Sep 23 '12 at 18:53
  • 1
    You can't store `total_score` itself because it depends on a continuously changing parameter: `age_in_hours`. All you need to store is `votes_gained` which would only need to be updated when someone votes. Does the table have a :vote column? – cdesrosiers Sep 23 '12 at 19:14
  • I see. The voting system has its own tables that reference target and source ids, and so the Post table itself does not have any columns related to votes. I think I can find the votes for Post.find(1) in the RSReputation table using something like RSReputation.where(target_type: "Post", target_id: "1"). Let me see if this works... – umezo Sep 23 '12 at 19:48
  • Btw, which database are you using? – cdesrosiers Sep 23 '12 at 19:57
  • Hi @cdesrosiers. Can you please tell me how I can extract the `value` column from the collection RSReputation.where(target_type:"Post", target_id:"1"), when `.value` is not defined as a method? Thanks very much! – umezo Sep 23 '12 at 20:05

1 Answers1

1

Seeing as how you can't rely on active record to translate the formula into SQL, you have to write it yourself. The only potential concern here is that this is not a database-independent solution.

Since you are using Postgres, you can define your scope as (I haven't tested this yet, so let me know whether it works):

AGE_IN_HOURS = "(#{Time.now.tv_sec} - EXTRACT (EPOCH FROM posts.created_at))/3600"
TOTAL_SCORE = "(rs_reputations.value - 1)/((#{AGE_IN_HOURS}) + 2)^1.5"

default_scope joins("INNER JOIN rs_reputations ON rs_reputations.target_id = posts.id").where("rs_reputations.target_type = 'Post'").order(TOTAL_SCORE)

EDIT: Actually this won't work because, as it stands, Time.now is calculated one time (when the model loads), but it needs to be recalculated each time records are pulled. Use

default_scope lambda { order_by_score }

def self.order_by_score

    age_in_hours = "(#{Time.now.tv_sec} - EXTRACT (EPOCH FROM posts.created_at))/3600"
    total_score = "(rs_reputations.value - 1)/((#{age_in_hours}) + 2)^1.5"

    joins("INNER JOIN rs_reputations ON rs_reputations.target_id = posts.id").where("rs_reputations.target_type = 'Post'").order(TOTAL_SCORE)
end
cdesrosiers
  • 8,862
  • 2
  • 28
  • 33
  • Seems to be working. Thanks! However it seems to be ordered in ascending order. How do I order in the other direction? I just tried changing to `.order("TOTAL_SCORE DESC")` but that doesn't work. Thanks so much for your support. – umezo Sep 23 '12 at 21:06
  • You can add "DESC" to the end of the `total_score` string: "(rs_reputations.value - 1)/((#{age_in_hours}) + 2)^1.5 DESC". You should test this thoroughly to make sure it's not just working for corner cases. – cdesrosiers Sep 23 '12 at 21:09
  • 1
    It just occurred to me that you can't use lambdas with default scope. Change that line to `scope :ordered_by_score, lambda { order_by_score }`. You'll have to explicitly call `Post.ordered_by_score`. – cdesrosiers Sep 23 '12 at 21:14
  • Hi @desrosiers. With the new scope, I get "undefined local variable or method `order_by_score`", even though `order_by_score` is defined in a block directly underneath the scope, as you suggested. – umezo Sep 23 '12 at 21:38
  • 1
    change to `def self.order_by_score` – cdesrosiers Sep 23 '12 at 21:48
  • Looks to be working. Thanks. This would have taken me forever to figure out! – umezo Sep 23 '12 at 21:50
  • No problem! Be sure to test it thoroughly though. Also, once the table gets very large, ordering might become a bit slow, so you might want to look into some kind of caching scheme. – cdesrosiers Sep 23 '12 at 21:52
  • Hi @cdesrosiers. Turns out, with the new scope (instead of default_scope), newly created posts are not added to the rs_reputations table, and so do not show up in the index view. However, if I change the index definition in my controller from Post.ordered_by_score to Post.all, the newly added post shows up in the rs_reputations table. Do you know what may be causing this? – umezo Sep 24 '12 at 01:42
  • Yeah, that happens because the scope `ordered_by_score` utilizes an INNER JOIN of the Posts table with the RSReputations table, wherein post records are paired with rs_reputation records that have target_id = post.id. But INNER JOIN will exclude any posts or rs_reputations that cannot be paired with a record in the other table. So if new posts are not getting added to rs_reputation, they will be excluded altogether from the scope. If you want to recover those posts, you need to use LEFT OUTER JOIN, but these records will not and cannot be sorted correctly without rs_reputation records. – cdesrosiers Sep 24 '12 at 02:43
  • Thanks @cdesrosiers. LEFT OUTER JOIN doesn't seem to change the results as far as I can tell. I need to read up on joins. Do you have any recommended resources? – umezo Sep 24 '12 at 03:11
  • Perhaps they're still excluded because the scope is trying to order based on their rs_reputation records, which aren't there. Look at http://www.w3schools.com/sql/sql_join.asp – cdesrosiers Sep 24 '12 at 03:14
  • HI @cdesrosiers, quick update on what I ended up doing. Couldn't figure out how t get the `ordered_by_score` to include newly created posts, but I decided to redirect all newly created posts to a separate "newest" action that orders by `created_at`. That way, it's included into the rs_reputations table automatically after every create. – umezo Oct 06 '12 at 21:43