2

In my Ruby on Rails application I am building a calendar with Fullcalendar JQuery plugin for the views and using the Icecube gem to calculate event recurrence. I also use Thinking Sphinx for search.

The users can have many calendars each with their own set of recurring events. Currently this requires me to have to load all of the users calendars, and all of the calendar events. Then I have to calculate all of the event occurrences that take place within a certain month.

I'd like to cache the calculating of the occurrences taking place, and was thinking I could serialize an array of the occurrences within an events db column.

Then I would only need to load the users calendars, and any of the events that have an occurrence within the month.

I'm not sure how or even if I can get Thinking Sphinx to search a serialized array though. Is this possible, and how would I define that in my models define_index block?

Is there a more efficient way that I can cache / search event occurrence times? Or is there any other solutions people have used to solve a similar issue of searching Arrays of Time objects?

JDutil
  • 3,622
  • 25
  • 32

2 Answers2

0

You don't need anything fancy, but you do need to properly plan out your data structures.

If you stick to a strict one-event-one-record methodology you can optimize your searching much better. The easy approach is to introduce a calendar month column that can be indexed and use that in a scope for fetching events:

@events = @user.events.where(:yearmonth => yearmonth)

The yearmonth column contains YYYYMM formatted values. You can also index by date or week if required.

A recurring event would have lots of child events, typically one event being parented to the previous instance of that event type. You would have to repeat this for a sufficiently long period of time, which can lead to a large number of records, but pre-computing it is probably easier than loading in all events for a user and trying to figure out what happens in a given calendar month.

Desktop or phone-based calendar applications can keep the entire event dataset in memory because it's usually only for one user at a time. In a multi-user system, or a multi-user * multi-tenant system, you will have way too much data to pull that off unless you create a specialized persistent server process to perform that function.

If you need performance and scale at that point it's actually not that hard to roll your own persistent server process that speaks a standard protocol like Memcached and use that as a kind of NoSQL data store.

tadman
  • 208,517
  • 23
  • 234
  • 262
  • I've considered using a one-event-one-record methodology, but I don't think that will work for our application in the long run. At least not using our MySQL database. It would generate far too many records as more and more clients come on board. Ideally we would be able to keep all occurrences under a single event record, and just cache occurrences taking place within the past or upcoming year. I'm not necessarily looking for the easiest solution, as we would like to make sure that we do it right the first time to scale without running out of database records. – JDutil Dec 06 '11 at 22:46
  • I don't know what you mean my "running out of database records" as on modern hardware with a well designed schema that would take forever and a half. How many users are you going to have? 10,000? Even if they had dozens of entries per day you would still have a hard time putting a dent in the storage of a 40GB database instance. Whatever structure you use, you should benchmark. The results can be surprising. – tadman Dec 07 '11 at 15:33
0

The trick to getting Sphinx to be able to search the cached occurrences was to use a comma separated list of timestamps. This allows you to define an index of a :multi type like so:

define_index do
  indexes :description
  indexes :name, :sortable => true
  indexes :location, :sortable => true
  has :calendar_id
  has :client_id
  has :end_at
  has :start_at
  has "calendar_events.occurrence_cache", :as => :occurrence_cache, :type => :multi 
  set_property :delta => true
end

Note that it's important to use the table name when defining the multi type otherwise I wasn't getting any results returned.

Now I can just use a sphinx scope to only find events with an occurrence within a timestamp range rather than having to load them all and determine their occurrences.

sphinx_scope(:ts_by_occurrence_at) do |occurrence_at_or_range|
  {:with => {:occurrence_cache => occurrence_at_or_range}}
end
JDutil
  • 3,622
  • 25
  • 32