Teradata sample 1 vs top 1 performance

Question

I've been running into an issue where select top 1 * from x.y will sit in queue for up to an hour whereas a select * from x.y sample 1 will run within seconds. I've been told that it's because select top is collecting all the data, sorting it and grabbing the top 1 and sample 1 is just grabbing a random row, no sorting the data at all, just a random row.

This explanation doesn't makes sense to me especially since I'm not including an order by in the query and the database usage metrics for sample are orders of magnitude worse.

AMPCPUTime	StartTime	FirstStepTime	FirstRespTime	Spool_Usage_GB	TotalIOCount	QueryText
2.92	10:57:37.52	10:57:37.52	10:57:37.92	0	24248	select * from x.y sample 1;
0.02	9:30:08.39	10:57:37.10	10:57:37.26	0	55	SELECT TOP 1 * FROM x.y;

EXPLAIN SELECT TOP 1 * FROM x.y;

First, we lock x.y in TD_MAP1 for read on a reserved RowHash to prevent global deadlock.
Next, we lock x.y in TD_MAP1 for read.
We do an all-AMPs STAT FUNCTION step in TD_MAP1 from
x.y by way of an all-rows scan with no residual conditions into Spool 5, which is built locally on the
AMPs in TD_Map1. The result rows are put into Spool 1
(group_amps), which is built locally on the AMPs. This step is
used to retrieve the TOP 1 rows. One AMP is randomly selected to retrieve 1 rows. If this step retrieves less than 1 rows, then
execute step 4. The size is estimated with high confidence to be 1 row (594 bytes). The estimated time for this step is 1 minute
and 34 seconds.
We do an all-AMPs STAT FUNCTION step in TD_MAP1 from
x.y by way of an all-rows scan with no residual conditions into Spool 5 (Last Use), which is
redistributed by hash code to all AMPs in TD_Map1. The result
rows are put into Spool 1 (group_amps), which is built locally on the AMPs. This step is used to retrieve the TOP 1 rows. The size is estimated with high confidence to be 1 row (594 bytes). The
estimated time for this step is 1 minute and 34 seconds.
We do a group-AMP SORT to order Spool 1 (group_amps) by the sort
key in spool field1.
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.

EXPLAIN select * from x.y sample 1;

First, we lock x.y in TD_MAP1 for read on a reserved RowHash to prevent global deadlock.
Next, we lock x.y in TD_MAP1 for read.
We do an all-AMPs SAMPLING step in TD_MAP1 from
x.y by way of an all-rows scan with no residual conditions into Spool 1 (group_amps), which is built
locally on the AMPs. Samples are specified as a number of rows.
The size of Spool 1 is estimated with high confidence to be 1 row (594 bytes).
Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of statement 1.

Related thread: Efficiency of top vs. sample in Teradata

Take a look at EstProcTime in DBQL. I think you will find the TOP query has a conservative, relatively high cost estimate which is causing it to classify into a workload with a much more restrictive throttle limit (and wait to begin execution). On the other hand, the cost estimate for SAMPLE may be very rough / too low. Inexact costing for SAMPLE isn't an issue for query planning since there really are no alternative plans to consider. But it can play havoc with workload management. — Fred, Jan 31 '23 at 19:01
As you said, the TOP query was waiting, check `DeferTime` & `DelayTime`. Plus `WDID` to see if they were assigned to different workloads. — dnoeth, Jan 31 '23 at 19:07

Teradata sample 1 vs top 1 performance

EXPLAIN SELECT TOP 1 * FROM x.y;

EXPLAIN select * from x.y sample 1;

0 Answers0