Having hit the same problem when deploying to a Docker Swarm, I am posting a solution partly borrowed from others.
Rails already has a mechanism to detect concurrent migrations: it takes a lock on the database. But it raises ActiveRecord::ConcurrentMigrationError where it should just wait.
One solution is then to run the migration in a loop: whenever a ConcurrentMigrationError is raised, wait 5 seconds and then retry the migration.
It is especially important that all containers perform the migration: if the migration fails, all containers must fail.
Solution from coffejumper
    namespace :db do
      namespace :migrate do
        desc 'Run db:migrate and monitor ActiveRecord::ConcurrentMigrationError errors'
        task monitor_concurrent: :environment do
          loop do
            puts 'Invoking Migrations'
            Rake::Task['db:migrate'].reenable
            Rake::Task['db:migrate'].invoke
            puts 'Migrations Successful'
            break
          rescue ActiveRecord::ConcurrentMigrationError
            puts 'Migrations Sleeping 5'
            sleep(5)
          end
        end
      end
    end
Sometimes you have other tasks that must also run one at a time alongside the migration, such as after_party, cron setup, etc. The solution is then to use the same mechanism as Rails and wrap the rake tasks in a database lock.
Below, based on Rails 6 code, migrate_without_lock performs the needed migrations while with_advisory_lock acquires the database lock (raising ConcurrentMigrationError if the lock cannot be acquired).
    require 'zlib'

    module Swarm
      class Migration
        def migrate
          with_advisory_lock { migrate_without_lock }
        end

        private

        # Runs every task that must execute in one container at a time.
        def migrate_without_lock
          puts "Database migration"
          Rake::Task['db:migrate'].invoke
          puts "After_party migration"
          Rake::Task['after_party:run'].invoke
          # ...
          puts "Migrations successful"
        end

        # Takes the advisory lock on a dedicated connection, yields, then releases it.
        def with_advisory_lock
          lock_id = generate_migrator_advisory_lock_id
          MyAdvisoryLockBase.establish_connection(ActiveRecord::Base.connection_config) unless MyAdvisoryLockBase.connected?
          connection = MyAdvisoryLockBase.connection
          got_lock = connection.get_advisory_lock(lock_id)
          raise ActiveRecord::ConcurrentMigrationError unless got_lock
          yield
        ensure
          if got_lock && !connection.release_advisory_lock(lock_id)
            raise ActiveRecord::ConcurrentMigrationError.new(
              ActiveRecord::ConcurrentMigrationError::RELEASE_LOCK_FAILED_MESSAGE
            )
          end
        end

        # Differs from Rails' own migrator salt, so this lock does not
        # collide with the one db:migrate takes.
        MIGRATOR_SALT = 1942351734

        def generate_migrator_advisory_lock_id
          db_name_hash = Zlib.crc32(ActiveRecord::Base.connection_config[:database])
          MIGRATOR_SALT * db_name_hash
        end
      end

      # based on rails 6.1 AdvisoryLockBase
      class MyAdvisoryLockBase < ActiveRecord::AdvisoryLockBase # :nodoc:
        self.connection_specification_name = "MyAdvisoryLockBase"
      end
    end
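To see what the lock id looks like, here is a small standalone sketch of the computation done in generate_migrator_advisory_lock_id. It only needs Ruby's zlib, no Rails. The database names 'app_db_1' and 'app_db_2' and the helper name advisory_lock_id are made up for the illustration.

```ruby
require 'zlib'

# Same salt as in Swarm::Migration above.
MIGRATOR_SALT = 1942351734

# Hypothetical helper: derives a numeric lock key from a database name.
# CRC32 is deterministic, so every container computes the same key.
def advisory_lock_id(database_name)
  MIGRATOR_SALT * Zlib.crc32(database_name)
end

# 'app_db_1' and 'app_db_2' are made-up database names.
first_id  = advisory_lock_id('app_db_1')
second_id = advisory_lock_id('app_db_2')

puts first_id == advisory_lock_id('app_db_1') # true: same name, same key
puts first_id == second_id                    # false: another database gets another key
puts first_id < 2**63                         # true: the key fits in a signed 64-bit integer
```

Because the key depends only on the salt and the database name, all containers pointing at the same database compete for the same lock, while containers of another application on the same server do not.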
Then, as before, loop to wait for the lock:
    namespace :swarm do
      desc 'Run migrations tasks after acquisition of lock on database'
      task migrate: :environment do
        result = 1 # exit status stays 1 unless one attempt succeeds
        (1..10).each do |i|
          Swarm::Migration.new.migrate
          puts "Attempt #{i} successfully terminated"
          result = 0
          break
        rescue ActiveRecord::ConcurrentMigrationError
          # Another container holds the lock: wait a random delay and retry.
          seconds = rand(3..10)
          puts "Attempt #{i} another migration is running => sleeping #{seconds}s"
          sleep(seconds)
        rescue => e
          # Any other error is a real failure: report it and stop retrying.
          puts e
          e.backtrace.each { |m| puts m }
          break
        end
        exit(result)
      end
    end
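If you want to check the retry logic without a database, the loop can be isolated in a plain Ruby helper. This is only a sketch: LockBusyError is a stand-in for ActiveRecord::ConcurrentMigrationError, and with_lock_retries and its sleeper parameter are names I made up so the waiting can be replaced in a test.

```ruby
# Hypothetical error class standing in for ActiveRecord::ConcurrentMigrationError.
class LockBusyError < StandardError; end

# Runs the block up to `attempts` times. Retries only when the lock is busy;
# any other error propagates immediately. Returns true on success, false
# when every attempt found the lock busy.
def with_lock_retries(attempts: 10, wait: 3..10, sleeper: ->(s) { sleep(s) })
  attempts.times do |i|
    yield(i + 1)
    return true
  rescue LockBusyError
    # Random delay so that containers do not all retry at the same moment.
    sleeper.call(rand(wait))
  end
  false
end

# Usage: the lock is "busy" on the first two attempts, free on the third.
calls = 0
ok = with_lock_retries(attempts: 5, sleeper: ->(_s) {}) do |attempt|
  calls += 1
  raise LockBusyError if attempt < 3
  puts "Attempt #{attempt} succeeded"
end
puts ok # true
```

The random wait plays the same role as rand(3..10) in the rake task: it spreads the retries of the different containers over time.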
Then in your startup script just launch the rake tasks
    set -e
    bundle exec rails swarm:migrate
    exec bundle exec rails server -b "0.0.0.0"
Finally, since your migration tasks are run by all containers, they must have a mechanism to do nothing when the work is already done (as db:migrate does).
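As an illustration of such a "do nothing when already done" guard, here is a minimal sketch. OneTimeTasks and run_once are hypothetical names, and the in-memory Set is only a stand-in: in a real deployment the record of finished tasks would have to live in the shared database so that every container sees it.

```ruby
require 'set'

# Hypothetical registry of finished tasks (in memory for the example only).
class OneTimeTasks
  def initialize
    @done = Set.new
  end

  # Runs the block only if `name` has not been recorded yet.
  # Returns true when the block ran, false when it was skipped.
  def run_once(name)
    return false if @done.include?(name)

    yield
    @done << name # recorded only after the block succeeded
    true
  end
end

tasks = OneTimeTasks.new
log = []
tasks.run_once('setup_cron') { log << 'cron installed' } # runs
tasks.run_once('setup_cron') { log << 'cron installed' } # skipped
puts log.inspect # ["cron installed"]
```

A task that raises is not recorded, so the next container that gets the lock will try it again.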
With this solution, the order in which Swarm launches containers no longer matters, and if something goes wrong, all containers know about the problem :-)