Rough definitions, for this answer:
- Simple constraint. A constraint that can be, without schema changes beyond the constraint definition, enforced by the straight forward use of one of SQL Server constraints: primary key, unique, foreign key, not null and check against a simple scalar expression. (I would include a unique filtered index as a simple constraint.)
- Complex constraint. A constraint that can not be so enforced.
Just as SQL server provides different constraints to handle different situations there are different techniques for enforcing more complex constraints. Which technique is appropriate depends on what one is trying to accomplish and the context one is in.
Broadly, there are two approaches to enforcing complex constraints, either add code to that prevents bad data or change the schema in such a way that SQL Server can enforce the constraint with a simple constraint.
Add code to enforce the constraint.
A common refrain is to use a trigger to enforce a constraint. So one writes an after trigger that checks whether the constraint has been violated by the DML that fired the trigger. If it has been violated the trigger throws an error, rolling back the transaction. Another approach is to take insert, update, delete rights away from all applications and force them through stored procedures that will error on attempts to violate the constraint. Using a UDF in a check constraint is new to me. The mechanism is similar to the after trigger, a DML statement made a change to the data. After that change, but before the DML statement is done, code is ran in the UDF and the result checked to see if the constraint is violated or not.
The big challenge with adding code that checks the constraint is correctly enforcing the constraint in the face of concurrency. Because transactions are isolated, the code verifying that the constraint is met may not see a conflicting change made by another session. See an example "Concurrency can cause trouble" at the end.
Now, some business rules are too complex to enforce with changes to the schema and code is required. I generally don't consider such business rules to be constraints that the database must enforce and place the code to enforce the business rules in either stored procedures or application code, with the other business rules. (Saying that these business rules are not database constraints does not getting correctness in the face of concurrency any easier.)
Change the schema so SQL Server can enforce the constraint.
A strength reduction from a complex constraint to a simple constraint. There is no one technique in this approach. Sometimes the application of a schema change that will allow SQL Server to enforce a constraint is easy to spot. And sometimes one may need to play with SQL Server features with outside the box thinking.Some schema changes that may be helpful:
- Increase the normalization. Many reasonable constraints cannot be expressed against denomalized data.
- Adding a persisted computed column to a table and using that column in a constraint.
- Indexed views can be amazing for constraint enforcement. (The restrictions placed on what views can be indexed is super frustrating.)
Example using another table
(This may count as a normalization.) Lets consider a table, like in your question, but adding one additional column, a primary key that allows update and delete of individual rows.
CREATE TABLE dbo.T (PK INT PRIMARY KEY IDENTITY(1, 1)
, A CHAR(1) NOT NULL
, B INT NOT NULL
/* other columns as needed */);
And the rule is that the following query should never return a row:
SELECT 'Uh oh!' AS "Uh oh!"
FROM dbo.T
GROUP BY A
HAVING COUNT(DISTINCT B) > 1
To enforce this one can add the following table and constraint:
CREATE TABLE dbo.TConstraint (A CHAR(1) NOT NULL PRIMARY KEY /* Only one B is allowed for any A */
, B INT NOT NULL
, UNIQUE (A, B) /* provide a target for a foreign key from dbo.T */);
ALTER TABLE dbo.T ADD FOREIGN KEY (A, B)
REFERENCES dbo.TConstraint (A, B);
Now the application logic or stored procedures that write to dbo.T
will have to change to write to dbo.EnforceComplexConstraintOnT
first. Updates to dbo.T (B)
are going to be messy because one will need to change the values in dbo.TCnstraint
before dbo.T
but that can't be done with the old values in dbo.T
. So an update becomes:
- Delete row from
dbo.T
- Update
dbo.TConstraint
- Insert "changed" row into
dbo.T
.
That update logic is incompatible with an IDENTITY column and incompatible with any foreign key referencing dbo.T
. So the contexts where this would work are limited.
Note also that this solution includes adding additional code for insert, update, and possibly delete. That extra code is not enforcing the constraint, but manipulating the data so that into a form such that SQL Server can enforce the constraint.
Example using indexed views
Indexed views come with caveats. See the documentation: https://learn.microsoft.com/en-us/sql/relational-databases/views/create-indexed-views?view=sql-server-ver15
Instead of the new table and constraint:
/* SET OPTIONS required by indexed views*/
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS
, CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO
/* view shows all the distiinct A, B in dbo.T */
CREATE VIEW dbo.TConstraint WITH SCHEMABINDING AS
SELECT A
, B
, COUNT_BIG (*) AS MustHaveForGroupingInAnIndexedView
FROM dbo.T
GROUP BY A, B;
GO
CREATE UNIQUE CLUSTERED INDEX A_B ON dbo.TConstraint (A, B);
CREATE UNIQUE NONCLUSTERED INDEX A ON dbo.TConstraint (A);
INSERT INTO dbo.T (A, B)
VALUES ('a', 100), ('a', 200)
Msg 2601, Level 14, State 1, Line 27
Cannot insert duplicate key row in object 'dbo.TConstraint' with unique index 'A'. The duplicate key value is (a).
Pros: 1) No extra code to get the data into a form SQL Server can constrain. 2) Updates to dbo.T (B)
can be done without doing the delete dbo.T
update dbo.TConstraint
insert dbo.T
dance, and all the restrictions that creates. Cons 1) The view has to have schemabinding
. 2) Deploying schema changes to dbo.T
will involve dropping the indexes on the view, dropping the view, altering dbo.T
, create the view, create the indexes on the view. Some deployment tools may struggle with the required steps. Large tables with large indexed views will have a longer deployment for changes to dbo.T
.
Concurrency can cause trouble.
@Dai's answer will work in at least some cases, but fails in the following case.
Setup:
- Create
dbo.TableX
, dbo.GetCountInvalidGroupsInTableX()
, and CK_Validity
as detailed in Dai's answer.
- Enable snapshot isolation for the database.
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON
- Connect to the database with two separate sessions.
In session 1:
/* Read committed was the default in my connection */
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
INSERT INTO TableX VALUES(10, 100);
In session 2:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
INSERT INTO TableX VALUES(10, 200);
/* (1 row affected) This insert did not block
on the open transaction */
In session 1:
COMMIT;
SELECT * FROM TableX;
/*
A B
10 100
10 200
*/
To add code to enforce a constraint one must know what isolation level may be used and ensure that the DML will block when needed so that when the code checking that the constraint is still satisfied can see everything it needs to.
Summary
Just as SQL Server has different constraint types for different situations there are different techniques for enforcing constraints when using what SQL provides against the current data model is not enough. Prefer schema changes that allows SQL Server to enforce the constraints. SQL Server is far more likely to get concurrency correct that most people writing Transact-SQL.