I have a Rails 3 project running on top of PostgreSQL 9.0.
Use Case: Users can request to follow Artists by name. To do this, they submit a list of names to a REST resource. If I can't find the Artist by name in the local collection, I consult last.fm for information about them and cache that information locally. This process can take some time, so it is delegated to a background job called IndexArtistJob.
Problem: IndexArtistJob will be run in parallel. Thus, it is possible that two users may request to add the same Artist at the same time. Both users should have the Artist added to their collection, but only one Artist should end up in the local database.
Relevant portions of the Artist model are:

    require 'services/lastfm'

    class Artist < ActiveRecord::Base
      validates_presence_of :name
      validates_uniqueness_of :name, :case_sensitive => false

      def self.lookup(name)
        artist = Artist.find_by_name(name)
        return artist if not artist.nil?

        info = LastFM.get_artist_info(name)
        return if info.nil?

        # Check local DB again for corrected name.
        if name.downcase != info.name.downcase
          artist = Artist.find_by_name(info.name)
          return artist if not artist.nil?
        end

        Artist.new(
          :name => info.name,
          :image_url => info.image_url,
          :bio => info.bio
        )
      end
    end

The IndexArtistJob class is defined as:

    class IndexArtistJob < Struct.new(:user_id, :artist_name)
      def perform
        user = User.find(user_id)

        # May return a new, uncommitted Artist model, or an existing, committed one.
        artist = Artist.lookup(artist_name)
        return if artist.nil?

        # Presume the thread is pre-empted here for a long enough time such that
        # the work done by this worker violates the DB's unique constraint.
        user.artists << artist
      rescue ActiveRecord::RecordNotUnique # Lost race, defer to winning model
        user.artists << Artist.lookup(artist_name)
      end
    end

What I'm trying to do here is let each worker commit the new Artist it finds, hoping for the best. If a conflict does occur, I want the slower worker(s) to abandon the work they did in favor of the Artist that was just inserted, and add that Artist to the specified user.
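
To make the intended behaviour concrete, here is a minimal, database-free sketch of the "insert optimistically, rescue the unique violation, defer to the winner" pattern. It is only an illustration under stated assumptions: `ArtistStore`, `RecordNotUnique`, and `find_or_create_artist` are hypothetical stand-ins I made up for this example (they are not Rails APIs and not part of my application), and a Mutex plays the role of the database's unique index on the lowercased name.

```ruby
require 'thread'

# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class RecordNotUnique < StandardError; end

# Hypothetical in-memory stand-in for the Artist table with a unique index
# on the lowercased name. The mutex mimics the database checking the
# constraint atomically at insert time.
class ArtistStore
  def initialize
    @rows = {}
    @lock = Mutex.new
  end

  # Case-insensitive lookup; returns nil when no row matches.
  def find_by_name(name)
    @lock.synchronize { @rows[name.downcase] }
  end

  # Inserts a row, or raises if the lowercased name is already taken.
  def insert(name)
    @lock.synchronize do
      key = name.downcase
      raise RecordNotUnique, name if @rows.key?(key)
      @rows[key] = { :name => name }
    end
  end

  def count
    @lock.synchronize { @rows.size }
  end
end

# Try to insert; if another worker won the race, use the row it inserted.
def find_or_create_artist(store, name)
  store.find_by_name(name) || store.insert(name)
rescue RecordNotUnique
  store.find_by_name(name)
end

store = ArtistStore.new
threads = Array.new(4) do
  Thread.new { find_or_create_artist(store, 'Radiohead') }
end
results = threads.map(&:value)
```

However the threads interleave, every worker should end up holding the same single row, which is the property I want from the real job: the losing workers discard their own insert and attach the winner's Artist instead.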
I'm aware that Rails validators are no substitute for actual data integrity checking at the database level. To handle this, I added a unique index on the Artist table's lowercased name field (which I also use for searching). Now, if I understand the documentation correctly, an ActiveRecord association collection commits changes to the item being added (Artist in this case) and to the underlying collection in a single transaction. But I can't be guaranteed that the Artist will be added.
Am I doing this correctly? If so, is there a nicer way to do it? I feel like structuring it around exceptions accentuates the fact that the problem is one of concurrency, and thus a bit subtle.