To allow zero-downtime index changes even while the search system is receiving new user-generated content, you can use the following strategy:
Define aliases for both read and write actions that point at an Elasticsearch index. When a model is updated, look up the users_write alias and write to every index it tracks, which will include both the currently active index and any index being built in the background.
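The mechanics of the dual-write can be sketched without Elasticsearch at all: a registry maps each alias name to one or more index names, and a write through the write alias fans out to every tracked index. The following is a minimal in-memory illustration of the idea; the names `aliases`, `indices`, and `write_through_alias` are invented for this sketch and are not part of Tire:

```ruby
# In-memory stand-in for Elasticsearch aliases: alias name => index names.
aliases = {
  "users"       => ["users_1"],            # read alias: old index only
  "users_write" => ["users_1", "users_2"]  # write alias: old index + rebuild
}

# Stand-in for the indices themselves: index name => { id => document }.
indices = Hash.new { |h, k| h[k] = {} }

# Writing through the write alias stores the document in every tracked
# index, so an in-progress rebuild receives live changes too.
def write_through_alias(aliases, indices, alias_name, id, doc)
  aliases.fetch(alias_name).each { |index_name| indices[index_name][id] = doc }
end

write_through_alias(aliases, indices, "users_write", 42, { name: "Ada" })

indices["users_1"][42]  # => { name: "Ada" }
indices["users_2"][42]  # => { name: "Ada" }
```

Reads continue to go through the "users" alias, so they only see the old index until the cutover.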
class User < ActiveRecord::Base
  def self.index_for_search(user_id)
    Timeout::timeout(5) do
      user = User.find_by_id(user_id)
      write_alias = Tire::Alias.find("users_write")
      if write_alias
        # Fan the write out to every index tracked by the alias, so an
        # in-progress rebuild receives the change as well.
        write_alias.indices.each do |index_name|
          index = Tire::Index.new(index_name)
          if user
            index.store user
          else
            # The record no longer exists; remove its document by type and id.
            index.remove 'user', user_id
          end
        end
      else
        raise "Cannot index without existence of 'users_write' alias."
      end
    end
  end
end
Now, when you want to do a full index rebuild (or create the index for the first time), create a new index, add it to the write alias, and start building it, knowing that any active users will have their data written to both indices simultaneously. Continue reading from the old index until the new one is fully built, then switch the read alias.
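One detail worth knowing: the set_alias_to_index helper performs the switch as a clear-then-add, but Elasticsearch can also swap an alias atomically with a single request to the _aliases endpoint, so there is never a window in which the read alias points at no index. A hedged sketch of that request body, with placeholder index names:

```ruby
require 'json'

old_index = "users_1392214018"
new_index = "users_1392300418"

# One _aliases request applies the remove and add actions together, so
# readers never observe an alias with zero indices behind it.
swap_request = {
  actions: [
    { remove: { index: old_index, alias: "users" } },
    { add:    { index: new_index, alias: "users" } }
  ]
}

puts JSON.pretty_generate(swap_request)
# POST this body to the cluster's /_aliases endpoint.
```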
class SearchHelper
  def self.set_alias_to_index(alias_name, index_name, clear_aliases = true)
    tire_alias = Tire::Alias.find(alias_name)
    if tire_alias
      # By default, point the alias only at the new index; pass
      # clear_aliases = false to keep the existing indices as well.
      tire_alias.indices.clear if clear_aliases
      tire_alias.indices.add index_name
    else
      # The alias doesn't exist yet, so create it pointing at the index.
      tire_alias = Tire::Alias.new(:name => alias_name)
      tire_alias.index index_name
    end
    tire_alias.save
  end
end
# This method belongs alongside the analyzer_configuration and user_mapping
# helpers it calls (for example, in the User model or a dedicated indexing class).
def self.reindex_users_index(options = {})
  finished = false
  read_alias_name  = "users"
  write_alias_name = "users_write"
  new_index_name   = "#{read_alias_name}_#{Time.now.to_i}"

  # Make a new index for re-indexing.
  index = Tire::Index.new(new_index_name)
  index.create :settings => analyzer_configuration,
               :mappings => { :user => user_mapping }
  index.refresh

  # Add the new index to the write alias (without clearing the old one) so
  # that any changes made while we're re-indexing are written to both.
  SearchHelper.set_alias_to_index(write_alias_name, new_index_name, false)

  # Reindex all users in batches.
  User.find_in_batches do |batch|
    index.import batch.map { |m| m.to_elasticsearch_json }
  end
  index.refresh
  finished = true

  # Update the read and write aliases to point only at the newly built index.
  SearchHelper.set_alias_to_index read_alias_name, new_index_name
  SearchHelper.set_alias_to_index write_alias_name, new_index_name
ensure
  # If anything raised before we finished, clean up the partially built index.
  index.delete if defined?(index) && !finished
end
A post describing this strategy can be found here: http://www.mavengineering.com/blog/2014/02/12/seamless-elasticsearch-reindexing/