Following this blog post, https://www.elastic.co/blog/changing-mapping-with-zero-downtime/, I am using aliases, the recommended best practice for updating an index in production with zero downtime. Despite this, we periodically see "index missing" exceptions across our application while an update is running. I have not been able to diagnose this behavior, and I am not sure what could be causing it.
Current Process
- Determine the index being targeted by the default alias (there's at most 1)
- Derive the next index name by appending or incrementing a version counter, e.g. index-name-v1
- Create a new index and populate data
- Update the alias in one operation: remove the old index and add the new index
- Delete the index that is two versions behind the now-current index. This ensures that an index which was still returning data just before the alias update is never deleted
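To make the naming and cleanup steps above concrete, here is a minimal sketch of how the version counter and the "two versions behind" rule could be computed. It is not our production code: the `IndexVersioning` class and its method names are hypothetical, and it assumes every index name ends in a `-vN` suffix starting at v1 (an unversioned name is treated as version 0). It only manipulates strings and does not call Elasticsearch.

```csharp
#nullable enable
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only: pure string logic, no Elasticsearch calls.
public static class IndexVersioning
{
    // Matches a trailing "-v<number>" suffix, e.g. "index-name-v12".
    private static readonly Regex VersionSuffix =
        new Regex(@"^(?<base>.+)-v(?<n>[0-9]+)$");

    // Splits "index-name-v3" into ("index-name", 3).
    // Assumption: a name without a suffix is treated as version 0.
    public static (string BaseName, int Version) Parse(string indexName)
    {
        Match m = VersionSuffix.Match(indexName);
        return m.Success
            ? (m.Groups["base"].Value, int.Parse(m.Groups["n"].Value))
            : (indexName, 0);
    }

    // Name of the index to create and populate next.
    public static string NextIndexName(string currentIndexName)
    {
        var (baseName, version) = Parse(currentIndexName);
        return $"{baseName}-v{version + 1}";
    }

    // Name of the index two versions behind the new one, or null when
    // no such versioned index exists yet (new index is v1 or v2).
    public static string? IndexToDelete(string newIndexName)
    {
        var (baseName, version) = Parse(newIndexName);
        return version >= 3 ? $"{baseName}-v{version - 2}" : null;
    }
}
```

With this sketch, `IndexVersioning.NextIndexName("index-name-v2")` gives `index-name-v3`, and `IndexVersioning.IndexToDelete("index-name-v3")` gives `index-name-v1`, so index-name-v2, which was serving queries right up to the swap, is left in place.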
Despite all this, we regularly get seemingly random "index missing" errors when querying data. The alias is never removed, only updated atomically. Is there a flaw in this approach that I am not considering?
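    public virtual void SwapAlias(string aliasName, string oldIndexName, string newIndexName)
    {
        // Both actions are sent in a single alias request, so the swap is atomic.
        Client.Alias(a =>
        {
            // Point the alias at the new index.
            a.Add(add => add.Alias(aliasName).Index(newIndexName));

            // Detach the alias from the old index, if there is one and it differs from the new index.
            if (oldIndexName != null && oldIndexName != newIndexName)
                a.Remove(remove => remove.Alias(aliasName).Index(oldIndexName));

            return a;
        });
    }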