41

Why do both of the following return zero? Surely the second is a negation of the first? I am using SQL Server 2008.

DECLARE 
    @a VARCHAR(10) = NULL ,
    @b VARCHAR(10) = 'a'

SELECT  
    CASE WHEN ( ( @a IS NULL
                      AND @b IS NULL
                    )
                    OR @a = @b
                  ) THEN 1
         ELSE 0
    END , -- Returns 0
    CASE WHEN NOT ( ( @a IS NULL
                      AND @b IS NULL
                    )
                    OR @a = @b
                  ) THEN 1
         ELSE 0
    END -- Also returns 0
Matt
  • 25,467
  • 18
  • 120
  • 187
ChrisN
  • 530
  • 5
  • 13
  • 4
    Put a `set ansi_nulls off;` in front of your query and you'll see the difference. Then you don't get a `NULL` from comparisons where at least one operarand is `NULL` but a `false`. – Tim Schmelter Sep 13 '16 at 13:31
  • 1
    @Luaan's answer is 100% correct. However, you might be looking for the `IS DISTICT FROM` operator (`<=>` in MySQL). – Owen Sep 13 '16 at 16:08

4 Answers4

50

It is a negation. However, you need to understand ANSI NULLs - a negation of a NULL is also a NULL. And NULL is a falsy truth value.

Therefore, if any of your arguments is null, the result of @a = @b will be null (falsy), and a negation of that will also be a null (falsy).

To use negation the way you want, you need to get rid of the NULL. However, it might be easier to simply reverse the results of the comparison instead:

case when (...) then 1 else 0 end,
case when (...) then 0 else 1 end

Which will always give you either 1, 0 or 0, 1.

EDIT:

As jpmc26 noted, it might be useful to expand a bit on how nulls behave so that you don't get the idea that a single NULL will make everything NULL. There are operators which do not always return null when one of their arguments is null - the most obvious example being is null, of course.

In a more broad example, logical operators in T-SQL use Kleene's algebra (or something similar), which defines the truth values of an OR expression like so:

  | T | U | F
T | T | T | T
U | T | U | U
F | T | U | F

(AND is analogous, as are the other operators)

So you can see that if at least one of the arguments is true, the result will also be true, even if the other is an unknown ("null"). Which also means that not(T or U) will give you a falsy truth value, while not(F or U) will also give you a falsy truth value, despite F or U being falsy - since F or U is U, and not(U) is also U, which is falsy.

This is important to explain why your expression works the way you expect it to when both arguments are null - the @a is null and @b is null evaluates to true, and true or unknown evaluates to true.

Luaan
  • 62,244
  • 7
  • 97
  • 116
  • 1
    Apparently versions of SQL Server later than 2008 will NOT allow you to set ansi nulls off. That is, if you rebuild your stored procedures these versions force them to have the "correct" behavior, and woe unto you if you have legacy code that depends on the incorrect behavior. You may have a lot of re-writing to do. Here is a good blog post on those evil ansi nulls: http://www.sqlservercentral.com/blogs/sqlstudies/2014/07/28/what-is-ansi_nulls-and-why-will-i-be-glad-when-it-finally-goes-away/ – MickeyfAgain_BeforeExitOfSO Sep 13 '16 at 18:36
  • what does "falsy truth value" mean? – haku Sep 13 '16 at 21:37
  • Doesn't the `NULL` only propagate outside the `OR` condition if `@a IS NULL AND @b IS NULL` is false? – jpmc26 Sep 13 '16 at 22:19
  • 2
    @NoSaidTheCompiler It means a truth value that behaves like false. – user253751 Sep 13 '16 at 22:43
  • 1
    @mickeyf The thing is, ANSI NULLs are very useful. It just means you need to understand how NULLs work - NULL is unknown, not zero. `0 + NULL` doesn't make sense, and making it cause an error is worse than the supposed cure. The main problem is if you take all the assumptions of your main language, and assume they will work the same in SQL (and all variants of SQL). For example, T-SQL doesn't have a boolean type at all - you can't declare a truthy variable. This causes much confusion, especially when people think that `bit` is a boolean. – Luaan Sep 14 '16 at 07:57
  • @jpmc26 Yes, but note that it doesn't catch when one of the two is NULL and the other isn't. Which still means you're comparing something to NULL, and the result of that is NULL. – Luaan Sep 14 '16 at 08:00
  • @Luaan I'm aware. `true OR NULL` is just `true`, because the "unknown" doesn't matter. `false OR NULL` is `NULL` because the `false` doesn't matter. But this answer doesn't make that important fact clear. – jpmc26 Sep 14 '16 at 13:26
  • 1
    @jpmc26 Oh, I see what you're pointing at now. I didn't say that *all* operators propagate `null`s (that would make `is null` very much impossible, to say the least), but I expanded the answer to cover this. – Luaan Sep 14 '16 at 14:49
  • @luann I don't disagree with you at all. It has caused me grief however, the MS for years had a (incorrect, to be sure) default, then changed so that many years of legacy code could no longer be compiled as is (even by explicitly and consciously choosing a non-recommended setting) but needed to be entirely re-written. Sometimes having the option to leave well enough alone is appropriate. – MickeyfAgain_BeforeExitOfSO Sep 14 '16 at 16:17
  • @mickeyf Well, the first MS SQL server I used was 2000, and that already had (SQL-92 compatible) ANSI NULLs by default. And as far as I can see, the SQL-92 support is even older than that. Maybe you used a client that didn't use ANSI NULLs? It was never the decision of the *server*. And you are mistaken, the option is still there in 2012; the disclaimer just says that it may be removed sometime in the future and you shouldn't rely on it. And it will cause an error, rather than silently ignoring the flag, so you know you have fixing to do. – Luaan Sep 15 '16 at 08:30
7

This 'weird' behavior that you are encountering is caused by the NULL values.

The negation of NOT (Something that returns NULL) is not TRUE , it's still NULL .

E.G.

SELECT * FROM <Table> WHERE <Column> = null -- 0 rows 
SELECT * FROM <Table> WHERE NOT (<Column> = null) -- Still 0 rows

In addition to what been said here, you can avoid that behavior by using

SET ANSI_NULLS OFF

Which will let the optimizer to treat NULL as normal value, and return TRUE\FALSE . You should note that this is not recommended at all and you should avoid it !

sagi
  • 40,026
  • 6
  • 59
  • 84
4

It is problem with @a=@b if either of this value is null then it will be problem

If you try below code will give correct results

DECLARE 
    @a VARCHAR(10) = NULL ,
    @b VARCHAR(10) = 'a'

SELECT  
    CASE WHEN ( ( @a IS NULL
                      AND @b IS NULL
                    )
                    OR @a = @b
                  ) THEN 1
         ELSE 0
    END , -- returns 0
    CASE WHEN NOT ( ( @a IS NULL
                      AND @b IS NULL
                    )
                    OR ISNULL(@a,-1) = ISNULL(@b,-1)
                  ) THEN 1
         ELSE 0
    END -- also returns 0
Kannan Kandasamy
  • 13,405
  • 3
  • 25
  • 38
  • 2
    Using `ISNULL` makes the `@a IS NULL AND @b IS NULL` check unnecessary. This also *assumes* that the values can never be `-1`, and are you sure you can even put `-1` there when the types are `VARCHAR`? – jpmc26 Sep 13 '16 at 22:22
0

NOT is always a negation. The reason for this behaviour of T-SQL lies in the fact that null values are treated in a special way depending on a database configuration setting (known as ansi_nulls). Depending on this setting, null is either treated in the same way as any other value or it is treated as "value not set". In this case, all expressions containing null values are regarded as invalid.

Furthermore, the expression

(@a IS NULL AND @b IS NULL)
OR 
@a = @b

covers only the case when both variables are NULL, it does not deal with cases when either @a or @b is NULL. If that happens, the result depends on the setting of ansi_nulls: if it is on, then the result of @a = @b is always false if one of the variables is NULL.

If ansi_nulls is off, then NULL is treated as a value and behaves as you expect.

To avoid such unexpected behaviour, you should cover all cases as follows:

DECLARE 
    @a VARCHAR(10) = 'a',
    @b VARCHAR(10) = null

SELECT  
    CASE 
        WHEN (@a IS NOT null AND @b IS null) THEN 0
        WHEN (@a IS null AND @b IS NOT null) THEN 0
        WHEN (@a IS null AND @b IS null) THEN 1
        WHEN (@a=@b) THEN 1
    ELSE 0
    END 

Note that in this example all null cases are dealt with before the @a=@b case is checked (in a CASE statement, the WHEN's are processed in the order as they appear, and if a condition matches, processing is finished and the specified value is returned).


To test all possible (relevant) combinations, you can use this script:

DECLARE @combinations TABLE (
    a VARCHAR(10),b VARCHAR(10)
    )

INSERT INTO @combinations
    SELECT 'a', null 
    UNION SELECT null, 'b'
    UNION SELECT 'a', 'b'
    UNION SELECT null, null
    UNION SELECT 'a', 'a'

SELECT a, b,
    CASE 
        WHEN (a IS NOT null AND b IS null) THEN 0
        WHEN (a IS null AND b IS NOT null) THEN 0
        WHEN (a IS null AND b IS null) THEN 1
        WHEN (a=b) THEN 1
    ELSE 0
    END as result
from @combinations
order by result 

It returns:

result of query

In other words, in this script null is treated as a value, hence a='a' and b=null returns 0, which is what you've expected. Only if both variables are equal (or both null), it returns 1.

Matt
  • 25,467
  • 18
  • 120
  • 187