What is the difference between the below queries & how it works?
SELECT * FROM some_table WHERE col IS NOT NULL
&
SELECT * FROM some_table WHERE col <> ''
Regards, Mubarak
What is the difference between the below queries & how it works?
SELECT * FROM some_table WHERE col IS NOT NULL
&
SELECT * FROM some_table WHERE col <> ''
Regards, Mubarak
The NULL
is special data type, it means absence of value.
An empty string on the other hand means a string or value which is empty.
Both are different.
For example, if you have name
field in table and by default you have set it to NULL
. When no value is specified for it, it will be NULL
but if you specify a real name or an empty string, it won't be NULL
then, it will contain an empty string instead.
NULL
is the absence of value, and usually indicates something meaningful, such as unknown or not (yet) determined. For example, if I start a project today, the StartDate
is 2012-02-25
. If I don't know how long the project is going to take, what should the EndDate
be? I might have some idea what the ProjectedEndDate
may be, but I would set the EndDate
to NULL
, and update it when the project is complete.
''
is a zero-length (or "empty") string. It is not technically the absence of data, since it might actually be meaningful. For example, if I don't have a middle name, depending on your data model ''
might make more sense than NULL
since the latter implies unknown but ''
can imply that it is known that I don't have one. NULL
can be used the same way of course, but then it is difficult to decipher whether it is not known whether it exists, or known that it does not exist. A lot of standards have dedicated values for things where it might not be known - for example Gender
has I believe 9 different character codes so that if M
or F
are not specified, we always know exactly why (unknown, unspecified, transgender, etc). Also think of the case where HeartRate
is NULL
- is it because there was no pulse, or because we haven't taken it yet?
They are not the same, though unfortunately many people treat them the same. If your column allows NULL
it means that you know in advance that sometimes you may not know this information. If you are not treating them as the same thing, then your queries would differ. For example if col
does not allow NULL
, your first query will always return all results in the table, since none of them can be NULL
. However NOT NULL
still allows an empty string to be entered unless you have also set up a check constraint to prevent zero-length strings also.
Allowing both for the same column is usually a bit confusing for someone trying to understand the data model, though I believe in most cases a NOT NULL
constraint is not combined with a LEN(col)>0
check constraint. The problem if both are allowed is that it is difficult to know what it means if the column is NULL
or the column is "empty" - they could mean the same thing, but they may not - and this will vary from shop to shop.
Another key point is that NULL
compared to anything (at least by default in SQL Server*
) evaluates to unknown, which in turn evaluates to false. As an example, these queries all return 0
:
DECLARE @x TABLE(i INT);
INSERT @x VALUES(NULL);
SELECT COUNT(*) FROM @x WHERE i = 1;
SELECT COUNT(*) FROM @x WHERE i <> 1;
SELECT COUNT(*) FROM @x WHERE i <= 3;
SELECT COUNT(*) FROM @x WHERE i > 3;
SELECT COUNT(*) FROM @x WHERE i IN (1,2,3);
SELECT COUNT(*) FROM @x WHERE i NOT IN (1,2,3);
Since the comparisons in the where clause always evaluate to unknown, they always come back false, so no rows ever meet the criteria and all counts come back as 0
.
In addition, the answers to this question on dba.stackexchange might be useful:
https://dba.stackexchange.com/questions/5222/why-shouldnt-we-allow-nulls
*
You can change this by using SET ANSI_NULLS OFF
- however this is not advised both because it provides non-standard behavior and because this "feature" has been deprecated since SQL Server 2005 and will become a no-op in a future version of SQL Server. But you can play with the query above and see that the NOT IN
behaves differently with SET ANSI_NULLS OFF
.
NULL means the value is missing but '' means the value is there but just empty string so first query means query all rows that col value is not missing, second one means select those rows that col not equals empty string
Update
For further information, I suggest you read this article:
https://sqlserverfast.com/blog/hugo/2007/07/null-the-databases-black-hole/
Select * from table where col IS NOT NULL
would return results excluded from Select * from table where col <> ‘’
because an empty string is also NOT NULL.
SET NOCOUNT ON;
DECLARE @tbl AS TABLE (value varchar(50) NULL, description varchar(50) NOT NULL);
INSERT INTO @tbl VALUES (NULL, 'A Null'), ('', 'Empty String'), ('Some Text', 'A non-empty string');
SELECT * FROM @tbl;
SELECT * FROM @tbl WHERE value IS NOT NULL;
SELECT * FROM @tbl WHERE value <> '';
Note that in the display you cannot distinguish between NULL and '' - this is only an artifact of how the grid and text client display the data, but the data in the set is stored differently for NULL and ''.
As stated in other answers, NULL
means 'no value' while empty string ''
means just that - empty string. You can think of fields that allow NULLs
as optional fields - they can be ignored and value for them may just not be provided.
Imagine an application where respondent selects their title (Mr, Mrs, Miss, Dr) but you do not require him/her to select any of those and leave it blank. In this case you would put NULL into relevant database field.
Distinction between NULL
and empty string may not be obvious because they both can mean 'no value' if you decide to. It depends entirely up to you but using NULL
would be better mainly because it is a special case for databases which are designed to handle NULLs quickly and efficiently (much faster than strings). If you use it instead of an empty string your queries will be faster and more reliable.