3

In our project we have multiple databases and we use alembic for migration.

I know that alembic is supposed to be used only for database structure migration, but we also use it for data migration as it's convenient to have all database migration code in one place.

My problem is that alembic works on one database at a time. So if I have databases DB1 and DB2, alembic will first run all migrations for DB1 and after that all migrations for DB2.

The problems start when we migrate data between databases. Say, if in I'm in revision N of DB1 try to access data in DB2, the migration can fail because DB2 can be on revision zero or N-X.

Question: is it possible to run alembic migrations one by one for all databases instead of running all migrations for DB1 and then running all for DB2?

My current env.py migration function:

def run_migrations_online():
    """
    for the direct-to-DB use case, start a transaction on all
    engines, then run all migrations, then commit all transactions.
    """

    engines = {}
    for name in re.split(r',\s*', db_names):
        engines[name] = rec = {}

        cfg = context.config.get_section(name)
        if not 'sqlalchemy.url' in cfg:
            cfg['sqlalchemy.url'] = build_url(name)

        rec['engine'] = engine_from_config(
            cfg,
            prefix='sqlalchemy.',
            poolclass=pool.NullPool)

    for name, rec in engines.items():
        engine = rec['engine']
        rec['connection'] = conn = engine.connect()
        rec['transaction'] = conn.begin()

    try:
        for name, rec in engines.items():
            logger.info("Migrating database %s" % name)
            context.configure(
                connection=rec['connection'],
                upgrade_token="%s_upgrades" % name,
                downgrade_token="%s_downgrades" % name,
                target_metadata=target_metadata.get(name))
            context.run_migrations(engine_name=name)

        for rec in engines.values():
            rec['transaction'].commit()
    except:
        for rec in engines.values():
            rec['transaction'].rollback()
        raise
    finally:
        for rec in engines.values():
            rec['connection'].close()
Minras
  • 4,136
  • 4
  • 18
  • 18
  • When you say migrating data do you mean moving data about in say columns or do you mean copying data from db1 to db2? – Ciaran Liedeman Sep 10 '15 at 14:26
  • @Anarci, moving data within the database is not a problem as all tables/columns in the database are on the same revision. The problem is indeed moving data between databases, as when I'm doing something in DB1 with revision N, the revision of DB2 is unknown. – Minras Sep 11 '15 at 15:07
  • Not sure about what you mean by **run migrations one by one for all databases** , for migrating different versions of schema to different databases, you can write your own script and run migration programmatically, [my answer in this SO thread](https://stackoverflow.com/a/69925218/9853105) has an example, hope that helps – Ham Nov 11 '21 at 08:36

1 Answers1

1

While I haven't tested this myself, I have been reading https://alembic.sqlalchemy.org/en/latest/api/script.html

It seems feasible that you could use ScriptDirectory to iterate through all the revisions, check if each db needs to apply that revision, and then rather than context.run_migrations you could manually call command.upgrade(config, revision) to apply that one revision.

Tonechas
  • 13,398
  • 16
  • 46
  • 80
karina
  • 805
  • 5
  • 10