5

I have a query like this on SQL Server 2008:

DECLARE @START_DATE DATETIME
SET @START_DATE = GETDATE()

SELECT * FROM MY_TABLE
WHERE TRANSACTION_DATE_TIME > DATEADD(MINUTE, -1440, @START_DATE)

In the SELECT query above, does SQL Server optimize the query so that the DATEADD result is not recalculated over and over? Or is it my responsibility to store the DATEADD result in a variable first?

Ahmet Altun
  • 3,910
  • 9
  • 39
  • 64

3 Answers

10

SQL Server functions that are considered runtime constants are evaluated only once. GETDATE() is such a function, and DATEADD(..., constant, GETDATE()) is also a runtime constant. By leaving the actual function call inside the query you let the optimizer see what value will actually be used (as opposed to a variable value sniff) and then it can adjust its cardinality estimations accordingly, possibly coming up with a better plan.

Also read this: Troubleshooting Poor Query Performance: Constant Folding and Expression Evaluation During Cardinality Estimation.
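As a quick illustration (a sketch reusing the question's table and column names; the exact `ConstExpr` label will differ in your plan), you can confirm the folding by capturing the showplan XML:

```sql
-- Sketch: the whole DATEADD(..., GETDATE()) expression is treated as a
-- runtime constant, evaluated once per query rather than once per row.
SET STATISTICS XML ON;

SELECT *
FROM MY_TABLE
WHERE TRANSACTION_DATE_TIME > DATEADD(MINUTE, -1440, GETDATE());

SET STATISTICS XML OFF;
-- In the returned showplan XML the predicate references a ConstExpr
-- node (e.g. ConstExpr1002) computed before execution starts.
```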

@Martin Smith: you can run this query:

set nocount on;
declare @known int;
select @known = count(*) from sysobjects;   -- baseline row count
declare @cnt int = @known;
while @cnt = @known                         -- loop until the filter stops matching every row
    select @cnt = count(*) from sysobjects where getdate()=getdate()
select @cnt, @known;

In my case, after 22 seconds it hit the boundary case and the loop exited. The important thing is that the loop exited with @cnt equal to zero. One would expect that if getdate() were evaluated per row, we would get a @cnt different from the correct @known count, but not 0. The fact that @cnt is zero when the loop exits shows that each getdate() was evaluated once and the same constant value was then used for the WHERE filtering of every row (matching none). I am aware that one positive example does not prove a theorem, but I think the case is conclusive enough.

Remus Rusanu
  • 288,378
  • 40
  • 442
  • 569
  • The first link seems to imply `getdate()` is only evaluated once per **query** but this doesn't loop indefinitely `SET NOCOUNT ON; WHILE EXISTS(SELECT 1 FROM sysobjects WHERE GETDATE() = GETDATE()) PRINT 'Yes'` – Martin Smith Sep 04 '11 at 18:49
  • @Martin: I think I know what's going on. Inspecting the plan XML shows that your query has *two* const expressions, each one evaluated once (in my showplan XML they were called `ConstExpr1080` and `ConstExpr1081`). The values are always different, since they are evaluated at distinct moments, but due to the low precision of the datetime type (3ms) they'll actually be the same value a lot of times. Every now and then the evaluation of first getdate and second getdate happen to fall on different sides of a rounding boundary and then the loop stops. – Remus Rusanu Sep 04 '11 at 21:07
  • So it seems that the caching happens per function reference rather than having one function call with the result re-used in all places in the query then as Conor's post suggests. Or are you of the opinion that I have misunderstood what he is saying and that shouldn't apply in this instance? – Martin Smith Sep 04 '11 at 21:17
  • An entire expression is evaluated before the query is run. If runtime constants appear multiple times in the expression, they are evaluated multiple times. Eg. `select datediff(ms, getdate(), getdate())+datediff(ms, getdate(), getdate()) from sysobjects` will evaluate `getdate()` four times. – Remus Rusanu Sep 04 '11 at 21:26
  • Yes, I agree that's how it works. I just thought the linked article was trying to state that something different *ought* to be happening - as it talks about referencing it more than once in a query, gives examples with multiple instances of `getdate()` and seems to be saying that the purpose of “runtime constants” is to avoid the situation where you could get more than one call with slightly different time values. Probably just me then! – Martin Smith Sep 04 '11 at 22:22
9

Surprisingly, I've found that using GETDATE() inline seems to be more efficient than performing this type of calculation beforehand.

DECLARE @sd1 DATETIME, @sd2 DATETIME;
SET @sd1 = GETDATE();

SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, @sd1)

SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, GETDATE())

SET @sd2 = DATEADD(MINUTE, -1440, @sd1);

SELECT * FROM dbo.table
WHERE datetime_column > @sd2;

If you check the plans on those, the middle query will always come out with the lowest cost (but not always the lowest elapsed time). Of course it may depend on your indexes and data, and you should not make any assumptions based on one query that the same pre-emptive optimization will work on another query. My instinct would be to not perform any calculations inline, and instead use the @sd2 variation above... but I've learned that I can't trust my instinct all the time and I can't make general assumptions based on behavior I experience in particular scenarios.
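One way to combine the variable style with accurate estimates (a sketch; this relies on the `OPTION (RECOMPILE)` hint embedding the variable's runtime value, which is worth verifying on your build of SQL Server) is:

```sql
-- Sketch: without a hint, "> @variable" gets a fixed selectivity guess
-- because the variable's value is unknown at compile time.
-- OPTION (RECOMPILE) compiles the statement with the variable's actual
-- runtime value, so the estimate reflects the real boundary date.
DECLARE @sd2 DATETIME = DATEADD(MINUTE, -1440, GETDATE());

SELECT * FROM dbo.[table]
WHERE datetime_column > @sd2
OPTION (RECOMPILE);
```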

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
  • 1
    +1 that is surprising. Just looked at it and I think this is because SQL Server doesn't do variable sniffing so will always assume that `> @variable` will match 30% of the rows (and the costs shown in the plan are just based on this estimate). It seems to be more accurate for the `DATEADD(MINUTE, -1440, getdate())` version at least at the time the plan is compiled. – Martin Smith Sep 04 '11 at 18:28
  • 1
    One potential issue/edge case with `GetDate()` queries is that it is evaluated at compile time and selectivities can change hugely just through passage of time [without triggering a recompile](http://stackoverflow.com/a/9905880/73226) – Martin Smith Mar 28 '12 at 12:11
3

It will be executed just once. You can double-check this in the execution plan: the Compute Scalar operator shows Estimated Number of Executions == 1.
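A minimal way to see this (a sketch reusing the question's table name; whether a separate Compute Scalar appears for the expression can vary by plan shape) is to capture per-operator execution statistics:

```sql
-- Sketch: SET STATISTICS PROFILE returns one row per plan operator,
-- including an Executes column. For the folded DATEADD expression the
-- corresponding operator shows Executes = 1, i.e. once per query.
SET STATISTICS PROFILE ON;

SELECT * FROM MY_TABLE
WHERE TRANSACTION_DATE_TIME > DATEADD(MINUTE, -1440, GETDATE());

SET STATISTICS PROFILE OFF;
```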

a1ex07
  • 36,826
  • 12
  • 90
  • 103