40

I'm using Alembic as a database migration tool for a Python project. When I run a command like this:

alembic revision -m "adding a column"

...it will add a new file called alembic/versions/xxxxxxxxxxxx_adding_a_column.py where xxxxxxxxxxxx is a randomly generated, 12-digit hash.

From the perspective of making things human-readable, this is a little bit problematic, because it means that when looking at the alembic/versions directory, all the files will appear in random order, rather than in sequential / chronological order.

Are there any options in Alembic to ensure these prefix revision IDs are sequential? I suppose I could rename the files manually and then update the references, but I'm wondering if there's already a feature like that baked in.

soapergem
  • 9,263
  • 18
  • 96
  • 152

5 Answers5

70

By the sounds of it, you are more interested in sequentially listed revision files rather than sequentially ordered revision ids. The former can be achieved without any change to how the revision ids are generated.

The alembic.ini file that is generated when you run alembic init alembic has a section that configures the naming of the revision files:

# template used to generate migration files
# file_template = %%(rev)s_%%(slug)s

And here is the explanation from the docs:

file_template - this is the naming scheme used to generate new migration files. The value present is the default, so is commented out. Tokens available include:

  • %%(rev)s - revision id
  • %%(slug)s - a truncated string derived from the revision message
  • %%(year)d, %%(month).2d, %%(day).2d, %%(hour).2d, %%(minute).2d, %%(second).2d - components of the create date, by default datetime.datetime.now() unless the timezone configuration option is also used.

So adding file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s to alembic.ini would name your revision like 2018-11-15_xxxxxxxxxxxx_adding_a_column.py.

I found this issue: https://bitbucket.org/zzzeek/alembic/issues/371/add-unixtime-stamp-to-start-of-versions which pointed me in the right direction.

A comment from from that issue:

timestamps don't necessarily tell you which file is the most "recent", since branching is allowed. "alembic history" is meant to be the best source of truth on this.

So, the file naming solution will not guarantee that migrations are ordered logically in the directory (but will help IMO). The same argument could be made against having sequential ids.

If you do want to specify your own revision identifier, use the --rev-id flag on the command line.

E.g.:

alembic revision -m 'a message' --rev-id=1

Generated a file called 1_a_message.py:

"""a message

Revision ID: 1
Revises:
Create Date: 2018-11-15 13:40:31.228888

"""
from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision = '1'
down_revision = None
branch_labels = None
depends_on = None


def upgrade():
    pass


def downgrade():
    pass

So you can definitely manage the revision identifiers yourself. It would be trivial to write a bash script to trigger your revision generation, automatically passing a datetime based rev_id, e.g. --rev-id=<current datetime> to govern order listed in the directory.

If the revision id isn't specified, the function rev_id() found at alembic.util.langhelpers is called:

def rev_id():
    return uuid.uuid4().hex[-12:]

Function calls to rev_id() are hard-coded in the alembic source, so short of monkey-patching the function, it will be difficult to override the behavior. You could create a fork of the library and redefine that function or make the function that it calls for id generation configurable.

SuperShoot
  • 9,880
  • 2
  • 38
  • 55
  • 6
    This was fantastic information. Thank you. This helped me make it easier to fix a recurring problem we have. If someone forgets to downgrade before checking out another feature branch, `alembic history -i` will explode when its unable to find the head. Prefixing the date makes it easier to locate the culprit. A built-in way to timestamp the revid is still the dream, since that's what actually goes into alembic_version. We'd taken up using `--rev-id $(date -u +"%Y%m%dT%H%M%SZ")` with `alembic revision`. – John Christopher Jones Jul 17 '19 at 00:29
  • 1
    Note: v1.8 added `%%(epoch)s` to the list of `file_template` tokens – odigity Jan 11 '23 at 20:13
  • It's a shame there's not a built-in option for simple auto-incrementing integers starting at 1. – odigity Jan 11 '23 at 20:13
11

I found how to do it in my case without additional bash scripts, just some mutation magic in env.py. Maybe it will help somebody.

Alembic has a powerful feature with customizing generated revisions so we can write override at this level:

# env.py
def process_revision_directives(context, revision, directives):
    # extract Migration
    migration_script = directives[0]
    # extract current head revision
    head_revision = ScriptDirectory.from_config(context.config).get_current_head()
    
    if head_revision is None:
        # edge case with first migration
        new_rev_id = 1
    else:
        # default branch with incrementation
        last_rev_id = int(head_revision.lstrip('0'))
        new_rev_id = last_rev_id + 1
    # fill zeros up to 4 digits: 1 -> 0001
    migration_script.rev_id = '{0:04}'.format(new_rev_id)

...
# then use it context.configure
context.configure(
  ...
  process_revision_directives=process_revision_directives,
)

If you also want to use it for revisions created without --autogenerate you should set revision_environment to true in alembic.ini

Dima Boger
  • 111
  • 2
  • 4
  • 1
    Great answer. To prepend the revision number and keep the original uuid from alembic, I have modified your function like this: `last_rev_id = int(head_revision[:4].lstrip("0"))` and `migration_script.rev_id = f"{new_rev_id:04}_{uuid.uuid4().hex[-12:]}"`. – Lionel Hamayon Jun 03 '21 at 13:45
  • quick note, to get `ScriptDirectory` do the following: `from alembic.script import ScriptDirectory` – rmehlinger Oct 28 '22 at 21:31
  • Great answer! The `lstrip('0')` is not needed – Levi Mar 10 '23 at 08:31
  • 1
    Also if you have multiple `context.configure`, you should add to all of them – Groosha Mar 24 '23 at 10:44
5

I made a script to autoincrement the revision number based off of how many migrations already exist maching the ####_ pattern. Here's a TLDR version. I you save this as migrations.sh and change the path in line 2

#!/usr/bin/env bash
NEXT_ID=`ls kennel/db/versions/* | grep -P '/\d{4}_.*\.py$' | wc -l`
alembic revision -m $@ --rev-id=`printf "%04d" ${NEXT_ID}`

Then you can use it like:

./migrations.sh migration_name
# or 
./migrations.sh migration_name --autogenerate

The full script has documentation and uses defaults to --autogenerate which can be disabled using an --empty flag. https://gist.github.com/chriscauley/cf0b038d055076a2a30de43526d4150e

chriscauley
  • 19,015
  • 9
  • 33
  • 33
2

while i don't need migration branching i use this

@writer.rewrites(ops.MigrationScript)
def revid_increment(ctx: migration.MigrationContext, revisions: tuple, op: ops.MigrationScript):
    op.rev_id = '{0:04}'.format(len(tuple(ctx.script.walk_revisions())) + 1)
    return op

it makes it easy to replace the current rev_id naming scheme, add timestamp, date, whatever ...

mani
  • 149
  • 1
  • 4
1

Yeah but by using date and time

Below are the dynamic variables that are available in alembic

file_template - this is the naming scheme used to generate new migration files. The value present is the default, so is commented out.

Tokens available include:

%%(rev)s - revision id
%%(slug)s - a truncated string derived from the revision message
%%(year)d, %%(month).2d, %%(day).2d, %%(hour).2d, %%(minute).2d, %%(second).2d - components of the create date as returned by datetime.datetime.now()

So for example you can use the below configuration for sequential filenames

# template used to generate migration files
file_template = %%(year)d-%%(month).2d-%%(day).2d-%%(hour).2d-%%(minute).2d-%%(second).2d_%%(rev)s_%%(slug)s

This will generate the following output

YYYY-mm-dd-HH-MM-SS_<rev>_<message_slug>
Anand Tripathi
  • 14,556
  • 1
  • 47
  • 52
  • 2
    This is fantastic. I was able to use this, and batch-rename my old version files as well, and it seemed like alembic was still able to migrate just fine. – DeusXMachina Oct 12 '21 at 23:01
  • @DeusXMachina "batch-rename"? Sounds good. Does alembic give such an option/tool? – jkulak Oct 17 '22 at 23:12
  • @jkulak Negative, I had to write something myself to do that. I basically used a python script and some bash to get all the git commit times of each of the files, and use that to rename the file to the git commit date (which is a close enough proxy). I then manually cleaned up any collisions. – DeusXMachina Oct 17 '22 at 23:15