1

I wrote the following query:

IF OBJECT_ID ('tempdb..#ColumnsType') IS NOT NULL DROP TABLE #ColumnsType
DECLARE @vQuery NVARCHAR(MAX) =''

IF OBJECT_ID ('tempdb..#random') IS NOT NULL DROP TABLE #random

CREATE TABLE #random (
ColumnID INT PRIMARY KEY IDENTITY(1,1) NOT NULL
, randomname VARCHAR(50)
, randomvalue INT)


INSERT INTO #random (randomname, randomvalue)
VALUES ('a3', 123)
    , ('bla', 4325)
    , ('another_bla', 5643)
    , ('end_here', 3)

select *
from #random

CREATE TABLE #ColumnsType (
                ColumnID INT PRIMARY KEY IDENTITY(1,1) NOT NULL
                , ColumnName sysname 
                , DataType sysname
                )

 
INSERT INTO #ColumnsType (ColumnName, DataType)
SELECT [name],
        system_type_id    
FROM Tempdb.Sys.Columns
WHERE Object_ID = Object_ID('tempdb..#random')
AND system_type_id = 56



DECLARE @i INT = (SELECT MIN(ColumnID) FROM #random);
DECLARE @maxId INT = (SELECT MAX(ColumnID) FROM #random);
DECLARE @ColumnName VARCHAR(200);
DECLARE @DataType VARCHAR(200);

WHILE @i <= @maxId
BEGIN
    SET @ColumnName = (SELECT ColumnName FROM #ColumnsType WHERE ColumnId = @i)

    -- SET @DataType = (SELECT DataType FROM #ColumnsType WHERE ColumnId = @i)

    SELECT @vQuery =
    'SELECT 

            MIN(TRY_CONVERT(NUMERIC(30, 4), ' +@ColumnName+ ')) AS ' +@ColumnName+ '_MinValue
            , MAX(TRY_CONVERT(NUMERIC(30, 4), ' +@ColumnName+ ')) AS ' +@ColumnName+ '_MaxValue
            , AVG(TRY_CONVERT(NUMERIC(30, 4), ' +@ColumnName+ ')) AS ' +@ColumnName + '_AvgValue
            , STDEV(TRY_CONVERT(NUMERIC(30, 4), ' +@ColumnName+ ')) AS ' +@ColumnName+ '_StandardDeviation
            , SUM(TRY_CONVERT(NUMERIC(30, 4), ' +@ColumnName+  ')) AS ' +@ColumnName+ '_TotalSum      
    FROM tempdb..#random'   -- +@Schema+'.'+@Table+ ''

    EXEC sp_executesql @vQuery
    PRINT @vQuery

    SET @i = @i + 1
END

For the sake of demonstration I create temp table with random values. I perform profiling on part of the columns which are consisting only of numeric values. To filter the columns I get their names and filter by type, using Tempdb.Sys.Columns. In normal case with my original data, I use INFORMATION_SCHEMA.COLUMNS but I think this is not that important.

The query returns the following:

enter image description here

The result is presented on two rows. What I'd like to do is to have this result on one row. The idea is to pivot the one row result after and to receive the following result:

enter image description here

Dale K
  • 25,246
  • 15
  • 42
  • 71
Yana
  • 785
  • 8
  • 23
  • `UNION ALL` your datasets. Don't run the SQL column by column, make it one statement for *all* your columns. – Thom A Dec 22 '20 at 09:30
  • @Larnu But how can I apply `UNION ALL` to a single dynamic query which returns the results separately? – Yana Dec 22 '20 at 09:32
  • Because you change your dynamic query so that it doesn't *"run the SQL column by column, make it one statement for all your columns."* The problem is you're thinking programmatically and using a loop. – Thom A Dec 22 '20 at 09:34
  • This is what I don't know how to do. And I only came up with this not optimal solution. Could you please show an example of what you mean? – Yana Dec 22 '20 at 09:35

2 Answers2

1

As I mentioned, you need to not use a loop, use a set based method and UNION ALL your dynamic statements. I assume here, as well, that you are using a recent version of SQL Server. If not, you'll need to replace STRING_AGG with the old FOR XML PATH (and STUFF) method.

This should be enough to get you started:

USE Sandbox;
GO

CREATE TABLE dbo.YourTable (Col1 int,
                            Col2 varchar(10));
GO

DECLARE @SchemaName sysname = N'dbo',
        @TableName sysname = N'YourTable';

DECLARE @SQL nvarchar(MAX),
        @CRLF nchar(2) = NCHAR(13) + NCHAR(10);

DECLARE @Delimiter nvarchar(50) = @CRLF + N'UNION ALL' + @CRLF;
        
SELECT @SQL = STRING_AGG(CONVERT(nvarchar(MAX),N'SELECT MIN(') + QUOTENAME(c.[name]) + N') AS ' + QUOTENAME(c.[name] + N'_MIN') + N',' + @CRLF +
                         N'       MAX(' + QUOTENAME(c.[name]) + N') AS ' + QUOTENAME(c.[name] + N'_MAX') + @CRLF + 
                         N'FROM ' + QUOTENAME(s.[name]) + N'.' + QUOTENAME(t.[name])
                         ,@Delimiter) WITHIN GROUP (ORDER BY c.column_id)
FROM sys.schemas s
     JOIN sys.tables t ON s.schema_id = t.schema_id
     JOIN sys.columns c ON t.object_id = c.object_id
WHERE s.[name] = @SchemaName
 AND t.[name] = @TableName

PRINT @SQL;

EXEC sys.sp_executesql @SQL;
GO

DROP TABLE dbo.YourTable;
Thom A
  • 88,727
  • 11
  • 45
  • 75
  • I receive this mistake: STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation. Also, I need to filter only the columns that are numeric. If I do it in the dynamic query statement, it is throwing error, because it is still going through non-numeric columns. – Yana Dec 22 '20 at 10:07
  • *"Also, I need to filter only the columns that are numeric. If I do it in the dynamic query statement, it is throwing error, because it is still going through non-numeric columns."* That wasn't in your question, but it's easy to do. Add another `JOIN` to the `FROM` to `sys.types` and further clause to the `WHERE` then to filter out non-numerical columns. – Thom A Dec 22 '20 at 10:08
  • *I receive this mistake: STRING_AGG aggregation result exceeded the limit of 8000 bytes.* Simple fix, change `'SELECT MIN('` to `CONVERT(nvarchar(MAX),'SELECT MIN(')` – Thom A Dec 22 '20 at 10:09
  • Well, it is not the result I was looking for, which is something like this: `Col1_MIN Col1_MAX Col2_Min Col2_max`, on one row. I receive rows for each column when running your solution. – Yana Dec 22 '20 at 10:30
  • That isn't what your [sample image](https://i.stack.imgur.com/Tp2lV.png) shows, @Yana , it shows you want 2 rows, not 1. – Thom A Dec 22 '20 at 10:32
  • You'll need dynamic dynamic (yes I said that twice) SQL for that, and that is the path to madness, @Yana . – Thom A Dec 22 '20 at 10:33
  • I explain in the description that I want it to be one row. And, I am really sorry if I drive you crazy. It is not my intent to do so. – Yana Dec 22 '20 at 10:47
  • You state *"The idea is to pivot the one row result after and to receive the following result"* So to receive the result in the image, @Yana ... That is what the above does. If that isn't what you're after, then you're question is unclear, and you have (with respect) wasted my time here. – Thom A Dec 22 '20 at 10:53
  • I understand now where the misunderstanding occurred. You have not wasted your time. But I will need time to get what you did with STRING_AGG. It is the first time I see it used, I have never used it. Thank you and I am sorry again. – Yana Dec 22 '20 at 11:03
0

Here is a very clean and nice solution which is what I was looking for:

  • I choose the columns, from specific table, scan it once and perform multiple calculations on it
  • I use dynamic query and make it one statement for all columns.
  • It works really fast. It took a little over 5 minutes to return result for table with 50 mil rows.

The only thing that is left to do is to UNPIVOT in order to insert results in a table that I want.

    DECLARE
    @q1 NVARCHAR(MAX)
,   @q2 NVARCHAR(MAX)
,   @q3 NVARCHAR(500)
,   @schema VARCHAR(50) = '' -- choose schema
,   @table VARCHAR(200) = '' -- choose table

SET @Q1 = 'SELECT ' + '''' + @table + '''' + ' as tableName, '
SET @Q3 = ' FROM ' + @schema + '.' + @table
SELECT @q2 = COALESCE(@q2 + ', ', '') 
+ ' max(' + columnName + ') as ' + columnName + '_max, ' 
+ ' min(' + columnName + ') as ' + columnName + '_min, '
+ ' avg(' + columnName + ') as ' + columnName + '_avg, '
+ ' stdev(' + columnName + ') as ' + columnName + '_stdev, '
+ ' sum(' + columnName + ') as ' + columnName + '_sum '
FROM (
SELECT s.[name] as schemaName, t.[name] as tableName, c.[name] as columnName, st.[name] as typeName 
FROM sys.schemas s
    INNER JOIN sys.tables t ON s.schema_id = t.schema_id
    INNER JOIN sys.columns c ON t.object_id = c.object_id
    INNER JOIN sys.types st ON st.user_type_id = c.user_type_id
WHERE 1=1
AND s.[name] = @schema
AND t.[name] = @table
AND st.[name] IN ('') -- choose columns of specific data type, that you want to profile
 ) data

SELECT @q1 = @q1 + @q2 + @q3

EXEC sys.sp_executesql @Q1 

Enjoy!

Yana
  • 785
  • 8
  • 23