In SQL Server 2022, why is `IS [NOT] NULL` converted to `IS [NOT] DISTINCT FROM NULL` in CHECK and filtered indexes? Is it safe?

Question

As a foreword, I am already familiar with the differences between the IS \[NOT\] NULL and IS \[NOT\] DISTINCT FROM NULL predicates.
- So to be clear, I am not asking what IS DISTINCT FROM NULL does, nor how it differs from other types of comparisons.
I'm running SQL Server 2022 RC0 which was released last week.
I redeployed my SSDT database (currently targeting Azure SQL) to this new local SQL Server 2022 box using SSDT.
- This newly deployed database is configured in SQL Server 2022 with Compatibility Level 160 (i.e. 2022).
- ...though the live/production deployment in Azure SQL currently has Compatibility Level 140 (SQL Server 2017), not that this matters in this specific problem.

My database has plenty of CHECK constraints and filtered INDEX objects. For example, lots of my tables have things like this:

CREATE TABLE dbo.Foobar (
    
    FooId      int          NOT NULL IDENTITY,
    Foo        varchar(100)     NULL,

    CONSTRAINT PK_Foobar PRIMARY KEY ( FooId ),

    CONSTRAINT CK_Foo_is_null_or_trimmed_nonempty CHECK (
        ( SomeNvarcharColumn IS NULL )
        OR
        (
            LEN( SomeNvarcharColumn ) > 0
            AND
            SomeNvarcharColumn NOT LIKE ' _%'
        )
);
GO

CREATE UNIQUE NONCLUSTERED INDEX UX_DistinctFooWhenNotNull ON 
    dbo.Foobar ( Foo )
WHERE
    ( Foo IS NOT NULL );

After doing a Database Schema Comparison in SSDT when publishing to my SQL2022 box, the _{always-broken-in-some-way-or-another} comparison report showed that all of the IS NULL and IS NOT NULL comparisons in all of my CHECK constraints and filtered indexes with IS NOT DISTINCT FROM NULL and IS DISTINCT FROM NULL respectively.

So the above CREATE TABLE now looks like this in SQL Server 2022:

CREATE TABLE dbo.Foobar (
    
    FooId      int          NOT NULL IDENTITY,
    Foo        varchar(100)     NULL,

    CONSTRAINT PK_Foobar PRIMARY KEY ( FooId ),

    CONSTRAINT CK_Foo_is_null_or_trimmed_nonempty CHECK (
        ( SomeNvarcharColumn IS NOT DISTINCT FROM NULL )
        OR
        (
            LEN( SomeNvarcharColumn ) > 0
            AND
            SomeNvarcharColumn NOT LIKE ' _%'
        )
);
GO

CREATE UNIQUE NONCLUSTERED INDEX UX_DistinctFooWhenNotNull ON 
    dbo.Foobar ( Foo )
WHERE
    ( Foo IS DISTINCT FROM NULL );

My problem is that I'm not absolutely certain that x IS NULL is always, 100% equivalent (i.e. identical) to x IS NOT DISTINCT FROM NULL under all conditions (and similarly, that x IS NOT NULL is identical to x IS DISTINCT FROM NULL) - but are they really?

Assuming they is identical, then why does SQL Server 2022 convert these expressions to an (annoyingly) far more verbose form? For what gain or advantage? (Certianly not expressiveness...), and why only in filtered indexes and CHECK constraints but not other expressions?

...and does this introduce any breaking changes?

`x IS NULL` and `x IS NOT DISTINCT FROM NULL` are equivalent expressions. They evaluate to `true` if `X` is null or `false` otherwise. So not sure why you have bolded "they are not equivalent expressions"? Seems annoying though as presumably SSDT will continue thinking they are changed and so generate unnecessary statements to `ALTER` them until you give in and change the version in source control to match the "canonical" form. I've had similar before with `CAST` and `CONVERT` in computed columns. — Martin Smith, Sep 04 '22 at 14:42
As for why it bothers converting these expressions to a canonical form at all my guess would be that this is to assist with matching expressions to the filtered index/computed column/check constraint if they are written in a logically equivalent way but with different text — Martin Smith, Sep 04 '22 at 16:02
Yes they are the same. Probably changes it because the `CHECK` constraint isn't saved verbatim, it's parsed into an expression, and in this case it's storing these two expressions the same way because they are equivalent. There is no reason to store two different versions of the same expression tree just to maintain your version control, it prefers to make a canonical expression which is easire to deal with. — Charlieface, Sep 04 '22 at 16:03
Probably same to how defaults are usually wrapped in () and other nonsense in the actual definition vs. the definition you write, ie. nothing to see here. Plus, remember that filtered indexes are used to match queries, and it likely simplifies the Query planner's job to be able to match the "WHERE"-condition one wrote against the filtered index expression if it's normalized. — siggemannen, Jul 30 '23 at 15:35

eag-ehsan · Answer 1 · 2022-09-04T14:34:07.103

-1

Using this operator "IS [NOT] DISTINCT FROM" when you want to compare two columns that have NULL is very practical and convenient. We know that two columns that contain NULLs can never be the same because NULLs are unknown and can never be equal to anything else, not even another NULL.

Sometimes you may want to treat NULLs as the same: For example, these two codes do exactly the same thing:

... WHERE (Column1 = Column2) or (Column1 IS NULL AND Column2 IS NULL)

and

... WHERE Column1 IS NOT DISTINCT FROM Column2

this syntax was added to SQL DB2 in version 8.

edited Sep 04 '22 at 14:34

answered Sep 04 '22 at 14:30

eag-ehsan

35
5

1

This does not answer my question, sorry (all you're doing is explaining what `IS DISTINCT FROM` does, and I already know that - I'm asking _why_ SQL Server 2022 seems to special-case it only in CHECK constraints and filtered indexes) - and I'm asking about SQL Server 2022, not IBM DB2. – Dai Sep 04 '22 at 14:31

In SQL Server 2022, why is `IS [NOT] NULL` converted to `IS [NOT] DISTINCT FROM NULL` in CHECK and filtered indexes? Is it safe?

1 Answers1