In Teradata
, which is more efficient - Sample or Top?
Teradata
to do less work and result in faster returns?
Routinely, I just want to see a few rows.
Thread related: differences between top and sample in teradata sql
In Teradata
, which is more efficient - Sample or Top?
Teradata
to do less work and result in faster returns?
Routinely, I just want to see a few rows.
Thread related: differences between top and sample in teradata sql
I just ran three queries on a large Teradata dataset:
SELECT * FROM table Sample 10;
SELECT * Top 10 FROM table;
(with no order by)SELECT * Top 10 FROM table ORDER BY column;
The DBQL metrics show that by far, the Top 10 with no order is the least resource-intensive. I had about a 99% drop in I/O & CPU just changing from SAMPLE to TOP.
So if your goal is purely efficiency, then TOP without the Order by is the clear winner going by TD's DBQL metrics.
TOP usually is the faster one, taking the shortcut out of processing when enough data has been returned.
Sample has a lot more options and works to be statistically correct. To achieve that, it may have to read more data blocks.
From an efficiency standpoint:
Using Top
WITHOUT an Order By
will be about the same as Sample
.
Using Top
WITH an Order By
will cause more overhead.
I think its better to use TOP
without order by than sample
. Both of these evaluate the whole expression before displaying the result.