A query that selects multiple aggregates from a couple of million rows shows mysteriously poor performance: it is orders of magnitude slower than the same query in Postgres, and orders of magnitude slower than calculating each aggregate in its own separate SQL query.
STATISTICS IO reveals an inexplicably high count of logical reads from 'Worktable': something like 157 million logical reads when processing 50 million rows (92917 pages read from the source table). The behaviour is exactly the same in all SQL Server versions and editions that I have access to, from SQL Server 2008 R2 through SQL Server 2019.
The simplest query I could find that exhibits this behaviour looks like this:
select count(*), count(distinct m13), count(distinct m23) from foo
m13 and m23 are low-cardinality foreign keys, which are simulated in the following repro script by taking the record number modulo 13 and 23, respectively (currently configured for processing 50% of 10,000,000 rows):
-- note: modify the cross join in the CTE for X in order to achieve the desired table size
if object_id('tempdb..#T') is null begin
create table #T (n int primary key clustered, m13 tinyint not null, m23 tinyint not null);
with -- ISNULL() removes the stain of perceived nullability
D as (select isnull(value, 0) as value from (values (0), (1), (2),(3),(4),(5),(6),(7),(8),(9)) v (value)),
H as (select isnull(d1.value * 10 + d0.value, 0) as value from D d1 cross join D d0),
K as (select isnull(h.value * 10 + d.value, 0) as value from H h cross join D d),
M as (select isnull(k1.value * 1000 + k0.value, 0) as value from K k1 cross join K k0),
-- using ROW_NUMBER() here to allow easy composition via cross join, for freely choosing the target table size
X as (select isnull(row_number() over (order by (select null)), 0) as value from D cross join M)
insert into #T
select value, value % 13, value % 23 from X;
end;
-- dial the desired percentage of rows to skip via the factor here
declare @threshold int = (select floor(0.50 * max(n) + 1) from #T);
select compatibility_level, @@version as "@@version" from sys.databases where name = db_name();
select '@threshold = ' + cast(@threshold as varchar);
set statistics time on;
set statistics io on;
select count(*) from #T where n >= @threshold;
select count(*), count(distinct m13) from #T where n >= @threshold;
select count(*), count(distinct m13), count(distinct m23) from #T where n >= @threshold;
-- logical reads from 'Worktable' for @threshold at 50%:
-- 10094 for 1e4 ( 12 pages read from #T)
-- 100942 for 1e5 ( 96 pages read from #T)
-- 1306626 for 1e6 ( 933 pages read from #T)
-- 14891285 for 1e7 ( 9295 pages read from #T)
-- 157450233 for 1e8 (92917 pages read from #T)
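-- optional aside (not part of the measurements above): to see which plan operator is responsible
-- for the worktable I/O, the actual execution plan for the three-aggregate query can be captured
-- as XML alongside the statistics; uncomment to run:
-- set statistics xml on;
-- select count(*), count(distinct m13), count(distinct m23) from #T where n >= @threshold;
-- set statistics xml off;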
print char(13) + '### now the same in piecemeal fashion ... ###';
declare @cnt int, @m13 int, @m23 int = (select count(distinct m23) from #T where n >= @threshold);
select @cnt = count(*), @m13 = count(distinct m13) from #T where n >= @threshold;
select @cnt, @m13, @m23;
set statistics io off;
set statistics time off;
Is there a way to goad SQL Server towards reasonable execution plans by rewriting the query, instead of issuing separate queries for individual aggregates? My goal here is to understand where I led the SQL engine down a dark path and how I can write better SQL in the future.
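For illustration, the kind of single-statement rewrite I have in mind would compute each DISTINCT count in its own derived table and combine the results with a CROSS JOIN; something along these lines (a sketch only, reusing @threshold from the script above, and not measured in the figures quoted earlier):

select t.cnt, d13.cnt_distinct_m13, d23.cnt_distinct_m23
from       (select count(*)            as cnt              from #T where n >= @threshold) t
cross join (select count(distinct m13) as cnt_distinct_m13 from #T where n >= @threshold) d13
cross join (select count(distinct m23) as cnt_distinct_m23 from #T where n >= @threshold) d23;

This still scans the table once per aggregate, though, so it merely hides the piecemeal approach inside one statement; I am hoping there is a formulation that lets the optimizer do the work in a single pass.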