0

I have three models, User, Venue, Rating, as follows:

class User < ActiveRecord::Base
  has_many :ratings
end

class Venue < ActiveRecord::Base
  has_many :ratings
end

class Rating < ActiveRecord::Base
  belongs_to :user
  belongs_to :venue
end

Users are able to rate venues from 0-5. Users may rate the venue as frequently as they want, and often their are different ratings from the same user on the same venue created within minutes of each other.

I want to be able to provide an average rating of a venue from the past hour, however I only wish to consider a single rating per user from the time period. So if a user rates the same venue multiple times in the past hour, only their most recent rating will be accounted for.

Currently I have this:

class Venue < ActiveRecord::Base
  has_many :ratings

  def past_hour_average
    ratings = self.ratings.where(:created_at => 1.hour.ago..Time.now).uniq_by(&:user_id)
    # loop through each record and compute average
    sum = 0
    ratings.each do |rating|
      sum += rating.value
    end
    return sum / ratings.size
  end
end

This method seems inefficient however. Every time I want the rating of a venue I have to calculate it. Under the assumption that there will be many users frequently rating a single venue what would a better approach to calculating an average rating be?

steve
  • 2,488
  • 5
  • 26
  • 39

2 Answers2

3

I think this should work:

def past_hour_average
  ratings = self.ratings.where(created_at: 1.hour.ago..Time.now).order(:created_at).group(:user_id)

  ratings.sum(:value) / ratings.count
end

You might just need to reverse the ordering if this returns the oldest rather than the most recent rating from each user.

This is doing exactly what your code does ... it just gets the database to do the sum for you rather than calculating it manually within your ruby code.

Jon
  • 10,678
  • 2
  • 36
  • 48
2

Are you sure that the efficiency of this calculation will even be an issue? I consider it very unlikely, unless your site experiences very, very heavy read load.

But it does really matter, here is one simple thing you can do which might help a little bit:

 def past_hour_average
   @past_hour_average ||= begin
     # calculation here
   end
 end

That will ensure that the calculation is not done more than once for a single Venue in the space of a single request.

If you need better than that, and you have actually checked to make sure this is really an issue, you could cache the results of the calculation and invalidate the cache if it is older than a certain number of minutes. I wouldn't bother with MemCached (et al) here. I would just do something like:

 class Venue
   @@avg_rating_cache = {}

   def past_hour_average
     if avg,time = @@avg_rating_cache[self.id] && time > (Time.now - 10.minutes)
       @@avg_rating_cache[self.id] = [avg, Time.now]
       return avg
     end

     value = calculation_here
     @@avg_rating_cache[self.id] = [value, Time.now]
     value
   end
 end

That will cache results directly in the memory of each application process (so there will be no extra overhead/latency from accessing a MemCached cache). If you have more than about 10,000 venues, you will need to evict entries from the cache as new ones are added to prevent excessive memory usage.

Alex D
  • 29,755
  • 7
  • 80
  • 126
  • I am not familiar with cacheing in rails but will look into. You're probably right about not having to worry about the load also. Thanks! – steve Jan 12 '14 at 22:31