The General Process
We create a baseline for a particular version (say, v1.0
). A baseline includes one complete schema creation script, as well an upgrade script from allowed previous versions, if any (more on that in a moment). So for v1.0
, we'd have just one script:
baseline-v1.0.sql
From that baseline, we create incremental change scripts as we work from the previous baseline. These scripts are created in a way that they are reentrant, so that they can be run safely multiple times (where the first time only does any actual work; see the next paragraph on a suggestion how). We just create a file for each change script with the baseline name and a timestamp (which we call the version). So for example, say we create two change scripts after a baseline. We'd have the following files:
baseline-v1.0.sql (for creating new installations)
baseline-v1.0-201211071220.sql (created on Nov. 7, 2012 at 12:20 PM UTC)
baseline-v1.0-201211122019.sql (created on Nov. 12, 2012 at 8:00 PM UTC)
We create a schema_version
table that has two columns: baseline
and version
. baseline
is some label (such as v1.0
mentioned above), and version
is just a timestamp of when the change script was created (we chose to do this because creating arbitrary version numbers created annoying administrative overhead, where a timestamp was easy to use). So before running the change script, we check to see if the change script has been applied yet, by querying for it by baseline
and version
. If it's already present, just return out of the script or whatever. Otherwise, apply the change and insert into the schema_version
table to mark the change script completed.
Example change script:
-- Created by <developer> on Nov. 7, 2012 at 12:20 PM UTC
declare @schema_baseline varchar(10), @schema_version varchar(12)
set @schema_baseline = 'v1.0'
set @schema_version = '201211071210'
if exists (select 1 from schema_version where baseline = @schema_baseline and version = @schema_version = @schema_version) return 0
-- begin change script
-- place your schema changes here
-- end change script
insert into schema_version(@schema_baseline, @schema_version)
Now, when we actually install the software, we run the relevant baseline
script. As we upgrade that version, we just apply the change scripts in order.
When we hit a significant milestone in our product development phase, we create a new baseline. So, we create a new baseline script (again, this is a snapshot of the DB as a baseline), plus an upgrade script from the previous baseline. So let's say we have a new baseline, v2.0
, we'd have the following files:
baseline-v2.0.sql (for creating new installations)
baseline-v2.0-upgrade-v1.0.sql (for upgrading from v1.0)
Then the process continues.
How We Apply Changes
The scripts are all kept in source control. We do have a tool that packages these files and automatically upgrades databases, which our support and installation teams use. The tool figures out the current baseline of the target database, and asks the user if they wish to upgrade to the baseline in the package. If they do, and there is a valid upgrade path from the current version, it applies the upgrade script, and updates the schema_version.baseline
, and deletes all entries for change scripts from the previous baseline. If the database is new, it applies the regular baseline script. Either way, after the baseline is achieved, it applies all change scripts from the baseline that are present in the package, one at a time, in order, in a transaction. If a particular change script fails, it rolls back the last set of changes and errors out. We look at the log, fix any issues, then rerun the package again. At that point, it should just pick up at the last change script that succeeded, saving time.
Automation and Diff Tools
We do not allow diff tools to upgrade production databases directly. It's just too risky. We do use diff tools, of course, to help create our upgrade and change scripts, but once we have them, we comb through them, massage them, test them, etc., then create the upgrade or change script according to the specs above. We do use tools/shell scripts to create the change script files and put the boiler plate schema_version
checking.
Caveats
It's actually pretty straight-forward and it works well. The only time it really gets tricky is with branches. For the most part, branches are handled well. If we need a change script for a particular branch's work, it will fold into the mainline very well once we merge the branch back in. No problem. Where it gets tricky is when two branches try to do similar things, or where one branch relies on another. That's mostly a process and planning issue, though. If we get stuck in such a situation, we just create a new baseline (say v2.1
), then update the branches accordingly.
Another thing to keep in mind is if an installation wants to be upgraded from one baseline to another, it has to apply all outstanding changes for the current baseline, before we upgrade to the new one. In other words, we don't let installations jump right from where ever they are to the next baseline (unless, of course, they're already at the most recent version for the current baseline).