1

Background

I maintain a Python application that automatically applies SQL schema migrations (adding/removing tables and columns, adjusting the data, etc) to our database (SQL2016). Each migration is executed via PyODBC within a transaction so that it can be rolled back if something goes wrong. Sometimes a migration requires one or more batch statements (GO) to execute correctly. Since GO is not actually a T-SQL command but rather a special keyword in SSMS, I've been splitting each SQL migration on GO and executing each SQL fragment separately within the same transaction.

import pyodbc
import re

conn_args = {
    'driver': '{ODBC Driver 17 for SQL Server}',
    'hostname': 'MyServer',
    'port': 1298,
    'server': r'MyServer\MyInstance',
    'database': 'MyDatabase',
    'user': 'MyUser',
    'password': '********',
    'autocommit': False,
}
connection = pyodbc.connect(**conn_args)
cursor = connection.cursor()

sql = '''
    ALTER TABLE MyTable ADD NewForeignKeyID INT NULL FOREIGN KEY REFERENCES MyParentTable(ID)
    GO
    UPDATE MyTable
    SET NewForeignKeyID = 1
'''
sql_fragments = re.split(r'^\s*GO;?\s*$', sql, flags=re.IGNORECASE|re.MULTILINE)

for sql_frag in sql_fragments:
    cursor.execute(sql_frag)
    # Wait for the command to complete.  This is necessary for some database system commands
    # (backup, restore, etc).  Probably not necessary for schema migrations, but included
    # for completeness.
    while cursor.nextset():
        pass

connection.commit()

Problem

SQL statement batches aren't being executed like I expected. When the above schema migration is executed in SSMS, it succeeds. When executed in Python, the first batch (adding the foreign key) executes just fine, but the second batch (setting the foreign key value) fails because it isn't aware of the new foreign key.

('42S22', "[42S22] [FreeTDS][SQL Server]Invalid column name 'NewForeignKeyID'. (207) (SQLExecDirectW)")

Goal

Execute a hierarchy of SQL statement batches (i.e. where each statement batch depends upon the previous batch) within a single transaction in PyODBC.

What I've Tried

  • Searching the PyODBC documentation for information on how PyODBC supports or doesn't support batch statements / the GO command. No references found.

  • Searching StackOverflow & Google for how to batch statements within PyODBC.

  • Introducing a small sleep between SQL fragment executions just in case there's some sort of race condition. Seemed unlikely to be a solution, and didn't change the behavior.

  • I've considered separating each batch of statements out into a separate transaction that is committed before the next batch is executed, but that would reduce/eliminate our ability to automatically roll back a schema migration that fails.

  • EDIT: I just found this question, which is pretty much exactly what I want to do. However, upon testing (in SSMS) the answer that recommends using EXEC I discovered that the second EXEC command (setting the value) fails because it isn't aware of the new foreign key. I'm bad at testing and it actually does succeed. This solution might work but isn't ideal since EXEC isn't compatible with parameters. Also, this won't work if variables are used across fragments.

     BEGIN TRAN
         EXEC('ALTER TABLE MyTable ADD NewForeignKeyID INT NULL FOREIGN KEY REFERENCES MyParentTable(ID)')
         EXEC('UPDATE MyTable SET NewForeignKeyID = 1')
     ROLLBACK TRAN
    
     Invalid column name 'FK_TestID'.
    
ErikusMaximus
  • 1,150
  • 3
  • 13
  • 28
  • 1
    I just tried your `BEGIN TRAN / EXEC / …` block in SSMS and it worked fine for me with SQL Server 2017 and the default transaction isolation ("READ COMMITTED", IIRC). The column is added and the UPDATE is successful when the transaction is committed (instead of rolled back). – Gord Thompson Sep 02 '21 at 19:32
  • Is it actually splitting it correctly? You have `re.split(r'^\s*GO;?\s*$'` shouldn't that be `re.split(r'\s*GO;?\s*'` otherwise it will only split a maximum of once because there is only one beginning and end of text – Charlieface Sep 02 '21 at 19:41
  • @GordThompson I just re-tested my `EXEC` block and that is working now. I must have typo'ed something earlier. `EXEC` isn't an ideal solution since it isn't compatible with parameters, but might work if that's my only option. I also noticed that in my example code I'm executing the whole `sql` rather than `sql_frag`, so I'll fix that. @Charlieface good question but yes it is splitting correctly. Check out the docs for [re.MULTILINE](https://docs.python.org/3/library/re.html#re.MULTILINE). – ErikusMaximus Sep 02 '21 at 21:09
  • Also, the `EXEC` approach isn't compatible with variables that are used across multiple fragments. – ErikusMaximus Sep 02 '21 at 21:27
  • 1
    If your Python script is reading the SQL code (including `GO` keywords) from a text file then you might consider just shelling out and running the `sqlcmd` utility with that text file as input. – Gord Thompson Sep 04 '21 at 22:10
  • I tested `sqlcmd` and it does seem to meet my needs. @GordThompson can you write an answer recommending `sqlcmd`? Ultimately I'd prefer a solution that supports PyODBC, but If I'm not able to find such a solution over the next week then I'd accept the `sqlcmd` answer. – ErikusMaximus Sep 07 '21 at 18:55

1 Answers1

1

If you are reading the SQL statements from a text file (such as one produced by scripting objects in SSMS) then you could just use Python's subprocess module to run the sqlcmd utility with that file as the input (-i). In its simplest form that would look like

server = "localhost"
port = 49242
uid = "scott"
pwd = "tiger^5HHH"
database = "myDb"
script_file = r"C:\__tmp\batch_test.sql"
"""contents of the above file:
DROP TABLE IF EXISTS so69020084;
CREATE TABLE so69020084 (src varchar(10), var_value varchar(10));
INSERT INTO so69020084 (src, var_value) VALUES ('1st batch', 'foo');
GO
INSERT INTO so69020084 (src, var_value) VALUES ('2nd batch', 'bar');
GO
"""

import subprocess

cmd = [
    "sqlcmd",
    "-S", f"{server},{port}",
    "-U", uid,
    "-P", pwd,
    "-d", database,
    "-i", script_file,
]
subprocess.run(cmd)
Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
  • 1
    Microsoft also supports `sqlcmd` on [unix hosts](https://learn.microsoft.com/en-us/sql/linux/sql-server-linux-setup-tools?view=sql-server-ver15). – ErikusMaximus Sep 07 '21 at 21:31