2

I have a scalar-valued function called DATEONLY that returns DATEADD(DD,0, DATEDIFF(DD,0, @DATETIME)), just like this:

CREATE FUNCTION [DBO].[DATEONLY] ( @DATETIME DATETIME )
RETURNS DATETIME
BEGIN
    RETURN DATEADD(DD, 0, DATEDIFF(DD, 0, @DATETIME))
END

When I select a table using my function, the SQL Server Profiler counts a high RowCounts number than if I used directly DATEADD(DD,0, DATEDIFF(DD,0, @DATETIME)).

In my Dropxbox's Public Folder you can find a script.sql that can reproduce what I am talking about and you can also find a Trace.trc from my SQL Server Profiler.

script.sql: https://www.dropbox.com/s/gwbh54jqas7fhhc/script.sql

trace.trc: https://www.dropbox.com/s/gwbh54jqas7fhhc/Trace.trc

Just to simplify, look the RowCounts below.

SELECT DATEADD(DD,0, DATEDIFF(DD,0, INCOMING)) AS DATA, COUNT(*) AS SOULS
FROM HELL
GROUP BY DATEADD(DD,0, DATEDIFF(DD,0, INCOMING))

RowCounts = 6

SELECT DBO.DATEONLY(INCOMING) AS DATA, COUNT(*) AS SOULS
FROM HELL
GROUP BY DBO.DATEONLY(INCOMING)

RowCounts = 32

In my real scenario, those 32 rows turns into millions RowCounts. If they are THE SAME THING, what's going on?! How can i optimize that to prevent changing my whole application?

Thanks a lot!

Eduardo Maia
  • 587
  • 1
  • 5
  • 10

3 Answers3

3

Scalar-valued user-defined function execution is not very efficient in sql server - it essentially does a separate execution call for each invocation, which is every row in the table. Adam Machanic has a good post on this subject which describes scalar udf execution and how inline table-valued function execution can be much faster.

One can rewrite your query to take advantage of the logic within a tvf instead, which the optimizer executes using the same query plan as your original expanded query and shows the same RowCounts=5 during execution.

CREATE FUNCTION [DBO].[DATEONLY2] ( @DATETIME DATETIME ) RETURNS TABLE
AS RETURN SELECT DATEADD(DD, 0, DATEDIFF(DD, 0, @DATETIME)) data

select data, count(*) as souls
from
(SELECT (select data from dbo.dateonly2(incoming)) data
FROM HELL) t
GROUP BY data
stevehem
  • 671
  • 3
  • 6
1

This is because the UDF is evaluated every time for each row. What you are experiencing is documented in the blog post: http://blogs.msdn.com/b/sqlserverfaq/archive/2009/10/06/performance-benefits-of-using-expression-over-user-defined-functions.aspx

0

I would suggest to use the following query

SELECT DATEADD(DD,0, DATEDIFF(DD,0, INCOMING)) AS DATA, COUNT(*) AS SOULS
FROM HELL
GROUP BY DATEADD(DD,0, DATEDIFF(DD,0, INCOMING))

you can create index on incoming column to avoid the performance degration.

Yogesh Bhadauirya
  • 1,225
  • 10
  • 10