-1

I would like to identify the number of null values in each column in all tables.I have a data base it consist of around 250 tables.Most of them are in use.The problem is almost all tables contain unwanted columns which created for some short term use.Now we want to identify columns with null values for all tables.Since the count of table is large and time is less.I would like to know an easiest way to identify null record count on each table in column wise.

I tried this query which i got from internet.But in this i have to give each table name manually.

DECLARE @cols1 NVARCHAR(MAX);
DECLARE @sql NVARCHAR(MAX);

SELECT @cols1 = STUFF((
    SELECT ', COUNT(CASE WHEN ISNULL(CONVERT(NVARCHAR(MAX), [' + t1.NAME + ']),'''' ) = '''' THEN 1 END) AS ' + t1.name
    FROM sys.columns AS t1
    WHERE t1.object_id = OBJECT_ID('Area')
    -- ORDER BY ', COUNT([' + t1.name + ']) AS ' + t1.name
    FOR XML PATH('')
), 1, 2, '');

SET @sql = '
SELECT ' + @cols1 + '
FROM Area
'
EXEC(@sql)

Please help me get a improved query get Result.

Thanku

Eldho Baby M
  • 115
  • 1
  • 3
  • 13

2 Answers2

3

This is a mess, but it works:

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

CREATE TABLE #NullCounts (SchemaName sysname,
                          TableName sysname,
                          ColumnName sysname,
                          NULLCount bigint);

DECLARE @Delimiter nchar(3) = ',' +@CRLF;

SET @SQL = STUFF((SELECT @CRLF + @CRLF +
                         N'WITH Counts AS(' + @CRLF +
                         N'    SELECT N' + QUOTENAME(s.[name],'''') +N' AS SchemaName,' + @CRLF +
                         N'           N' + QUOTENAME(t.[name],'''') +N' AS TableName,' + @CRLF +
                         STRING_AGG(N'           COUNT_BIG(CASE WHEN ' + QUOTENAME(c.[name]) + N' IS NULL THEN 1 END) AS ' + QUOTENAME(c.[name]),@Delimiter) WITHIN GROUP(ORDER BY c.column_id) + @CRLF +
                         N'    FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name]) + N' T)' + @CRLF +
                         N'INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)' + @CRLF +
                         N'SELECT SchemaName,' + @CRLF +
                         N'       TableName,' + @CRLF +
                         N'       V.ColumnName,' + @CRLF +
                         N'       V.NULLCount' + @CRLF +
                         N'FROM Counts C' + @CRLF +
                         N'     CROSS APPLY (VALUES' +
                         STUFF(STRING_AGG(N'                        (N' + QUOTENAME(c.[name], '''') + N', C.' + QUOTENAME(c.[name]) + N')',@Delimiter) WITHIN GROUP (ORDER BY c.column_id),1,24,N'') + N')V(ColumnName,NULLCount);'
                  FROM sys.schemas s
                       JOIN sys.tables t ON s.schema_id = t.schema_id
                       JOIN sys.columns c ON t.object_id = c.object_id
                  GROUP BY s.[name], t.[name]
                  FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,4,N'');
     
--PRINT @SQL; --This is gunna be way longer than 4,000 characters, so you'll want SELECT

EXEC sys.sp_executesql @SQL;

GO

SELECT *
FROM #NullCounts
ORDER BY SchemaName,
         TableName,
         ColumnName;

GO

DROP TABLE #NullCounts;

Yes, I mix STRING_AGG and FOR XML PATH, yes it's an eyesore, but the printed (selected) SQL produces some very nice statements. See below:

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'PerformanceTest' AS TableName,
           COUNT_BIG(CASE WHEN TestID IS NULL THEN 1 END) AS [TestID],
           COUNT_BIG(CASE WHEN TestTarget IS NULL THEN 1 END) AS [TestTarget],
           COUNT_BIG(CASE WHEN TestName IS NULL THEN 1 END) AS [TestName],
           COUNT_BIG(CASE WHEN TimeStart IS NULL THEN 1 END) AS [TimeStart],
           COUNT_BIG(CASE WHEN TimeEnd IS NULL THEN 1 END) AS [TimeEnd],
           COUNT_BIG(CASE WHEN TimeTaken_ms IS NULL THEN 1 END) AS [TimeTaken_ms],
           COUNT_BIG(CASE WHEN TotalRows IS NULL THEN 1 END) AS [TotalRows],
           COUNT_BIG(CASE WHEN RowSets IS NULL THEN 1 END) AS [RowSets],
           COUNT_BIG(CASE WHEN AvgRowsPerSet IS NULL THEN 1 END) AS [AvgRowsPerSet]
    FROM [dbo].[PerformanceTest] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'TestID', C.[TestID]),
                        (N'TestTarget', C.[TestTarget]),
                        (N'TestName', C.[TestName]),
                        (N'TimeStart', C.[TimeStart]),
                        (N'TimeEnd', C.[TimeEnd]),
                        (N'TimeTaken_ms', C.[TimeTaken_ms]),
                        (N'TotalRows', C.[TotalRows]),
                        (N'RowSets', C.[RowSets]),
                        (N'AvgRowsPerSet', C.[AvgRowsPerSet]))V(ColumnName,NULLCount);

WITH Counts AS(
    SELECT N'dbo' AS SchemaName,
           N'someTable' AS TableName,
           COUNT_BIG(CASE WHEN id IS NULL THEN 1 END) AS [id],
           COUNT_BIG(CASE WHEN SomeCol IS NULL THEN 1 END) AS [SomeCol]
    FROM [dbo].[someTable] T)
INSERT INTO #NullCounts(SchemaName, TableName, ColumnName, NULLCount)
SELECT SchemaName,
       TableName,
       V.ColumnName,
       V.NULLCount
FROM Counts C
     CROSS APPLY (VALUES(N'id', C.[id]),
                        (N'SomeCol', C.[SomeCol]))V(ColumnName,NULLCount);

And yes, I really spent the last 45 minutes writing all that...

Honestly, this is not entry level, and if you don't understand it, you shouldn't be using it; but also, you I very much doubt you'll find a different solution that is entry level and that is as performant as this. A CURSOR, for example, although probably easier to understand would be really slow doing this.

Caveat: If you have any deprecated data types in your database (i.e. text) this will fail. If that is the case, you will need to ensure you eliminate them from the query in the WHERE. However, I suggest that you fix your data types (text, for example, has been deprecated for 15 years).

Thom A
  • 88,727
  • 11
  • 45
  • 75
0

For tabular results:

declare @sql  nvarchar(max) = 
(
select 'union all select (select object_schema_name('+ cast(tableobjectid as varchar(20))+') +''.''+ object_name('+ cast(tableobjectid as varchar(20))+') as "table/@name", count(*) as "table/@rowcount", ' + cols_concat + ' from ' + tablename + ' for xml path(''''), type)' as 'text()'
from
(
select 
    t.object_id as tableobjectid,
    quotename(schema_name(t.schema_id)) + '.' + quotename(t.name) as tablename,
    stuff( (select ', col_name(' + cast(c.object_id as varchar(20)) + ',' + cast(c.column_id as varchar(20)) + ') as "table/col/@name", count(*)-count('+case when type_name(c.system_type_id) in ('text', 'ntext', 'image') then ' case when ' + quotename(c.name) + ' is not null then 1 end' else quotename(c.name) end + ') as "table/col", null as "table"'
            from sys.columns as c where c.object_id = t.object_id for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '') as cols_concat
from sys.tables as t
where t.is_ms_shipped = 0
) as tbl
for xml path(''), type).value('.', 'nvarchar(max)')
;

select @sql = 'select @xml = (select * from (select cast(null as xml) as "*"  '+ @sql + ') as u for xml path('''') )';
declare @x xml;

set transaction isolation level read uncommitted;
exec sp_executesql @sql, N'@xml xml output', @xml = @x output;
set transaction isolation level read committed;

--shred 
select 
    t.col.value('@name[1]', 'varchar(200)') as tablename,
    t.col.value('@rowcount[1]', 'int') as tablerowcount,    
    r.col.value('@name[1]', 'varchar(200)') as columnname,  
    r.col.value('.[1]', 'int') as null_values
from @x.nodes('table') as t(col)
cross apply t.col.nodes('col') as r(col);

For multiple resultsets (one per table):

  declare @sql  nvarchar(max) = 
    (
    select 'select object_schema_name('+ cast(tableobjectid as varchar(20))+') +''.''+ object_name('+ cast(tableobjectid as varchar(20))+') as tablename, ' + cols_concat + ' from ' + tablename + ';' as 'text()'
    from
    (
    select 
        t.object_id as tableobjectid,
        quotename(schema_name(t.schema_id)) + '.' + quotename(t.name) as tablename,
        stuff( (select ', count(*)-count('+case when type_name(c.system_type_id) in ('text', 'ntext', 'image') then ' case when ' + quotename(c.name) + ' is not null then 1 end' else quotename(c.name) end + ') as "NULL_' + replace(quotename(name), '"', '""')+'"'  from sys.columns as c where c.object_id = t.object_id for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '') as cols_concat
    from sys.tables as t
    where t.is_ms_shipped = 0
    ) as tbl
    for xml path(''), type).value('.', 'nvarchar(max)');

    set transaction isolation level read uncommitted;
    exec(@sql);    
    set transaction isolation level read committed;
lptr
  • 1
  • 2
  • 6
  • 16
  • Why quote some values, and not the others, out of interest? – Thom A Feb 12 '20 at 13:47
  • @Larnu `select '"NULL_' + quotename(replicate('a', 128))+'"', quotename('NULL_' + replicate('a', 128))` – lptr Feb 12 '20 at 13:51
  • @Iptr that's quoted, yes, but not all of them: `select 'select '''+tablename+''' as tablename...'` – Thom A Feb 12 '20 at 13:51
  • quotename() wouldnt provide any safety when concatenating '''+tablename+' if the actual table name has a single quote.. in fact there is no need to concatenate strings, just get the table name by using the object_id ... – lptr Feb 12 '20 at 14:12
  • *"quotename() wouldnt provide any safety when concatenating"* yes it would. What would happen if a table's name had a `'` it in? Blindly injecting strings is why injection is a problem... – Thom A Feb 12 '20 at 14:15
  • Also `"NULL_' + quotename(name)+'"'` would fail if it has a `"` in it. – Thom A Feb 12 '20 at 14:17
  • quotename() does not escape quotes: create table dbo.[abc ';select 1 as test; --] (id int); declare @sql nvarchar(max) = (select 'select ''' + quotename(name) + ''' as tablename' from sys.tables where name = 'abc '';select 1 as test; --' ); exec(@sql); drop table dbo.[abc ';select 1 as test; --]; – lptr Feb 12 '20 at 14:30
  • *"quotename() does not escape quotes"* that isn't true. You can see this in my answer, but also try `QUOTENAME('abc''def''g','''')`. `QUOTENAME` takes 2 parameters, not 1. if the second is omitted, then it defaults to `'[]'`). See [Arguements](https://learn.microsoft.com/en-us/sql/t-sql/functions/quotename-transact-sql?view=sql-server-ver15#arguments) for a full list. – Thom A Feb 12 '20 at 14:34