
In my past experience, I've always used functions in simple cases where I need to select a dataset, without a lot of complex logic, and I need to also pass a parameter.

I've recently been informed that I should avoid using functions in MSSQL at all costs, because they often cause performance issues, and sometimes their use prevents indexes from being used properly. Can anyone speak to this point, explain in further detail whether this is true, and give some reasoning behind it?

wakurth
  • I'd avoid any advice that says "avoid X for performance reasons". *If* you're experiencing performance issues in a particular circumstance, than start looking into the possible causes. If, currently, the functions are the clearest way of expressing your intent, and *aren't* causing an actual performance issue, I'd leave them be. – Damien_The_Unbeliever Aug 08 '12 at 13:24
  • Without a specific example of a query, this question is meaningless. As with any feature in SQL (or almost any language), there are times when using functions is appropriate and times when using functions is not appropriate. – Gordon Linoff Aug 08 '12 at 13:28

2 Answers


You have been advised naively.


Scalar Functions

WHERE dbo.fn_get_year(tbl.field) = 2012 wraps tbl.field in a function call, making the predicate non-sargable and any index on that column unusable.

You will find much better performance with, for example, WHERE tbl.field >= '20120101' AND tbl.field < '20130101'.

In the first example, every record has to be processed, because the optimiser can't see through the function and infer which range of records will fit the criteria.

In the second example, you make it very clear you want a continuous block of records from point a to point b. This enables the optimiser to use the index for a range seek.
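As a sketch of the two approaches side by side (the table, column, and index names here are hypothetical, carried over from the examples above):

```sql
-- Assumed setup for illustration:
-- CREATE TABLE tbl (field DATETIME);
-- CREATE INDEX IX_tbl_field ON tbl (field);

-- Non-sargable: the function hides the column from the optimiser,
-- so every row must be evaluated (scan)
SELECT *
FROM tbl
WHERE dbo.fn_get_year(tbl.field) = 2012;

-- Sargable: a half-open range on the bare column lets the
-- optimiser perform a range seek on IX_tbl_field
SELECT *
FROM tbl
WHERE tbl.field >= '20120101'
  AND tbl.field <  '20130101';
```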


Table Valued Functions

All of that is very different from SELECT * FROM dbo.my_function(@parameter) AS data. There is nothing wrong with table valued functions being used in that way.

A complexity comes when joining the results of the function to another table or function.

If the function is multi-statement (declared with RETURNS @result TABLE and a BEGIN...END body, possibly containing IF blocks, etc.) then the whole result set of the function is materialised before the join is processed.

If the function is an inline function (just RETURNS TABLE AS RETURN SELECT blah FROM blah) then SQL Server treats it as a macro (unless you tell it not to). This means that your function code is substituted into your query, and a brand new execution plan is built for your query. That may mean that only the relevant records from your function are ever processed, due to index optimisation, etc.
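To make the distinction concrete, here is a sketch of the two kinds of TVF (the table, columns, and function names are invented for illustration):

```sql
-- Assumed table for illustration:
-- CREATE TABLE dbo.Sales (SaleDate DATETIME, Amount MONEY);

-- Inline TVF: a single RETURN SELECT. Expanded into the calling
-- query like a macro, so outer filters and indexes still apply.
CREATE FUNCTION dbo.SalesSince (@from DATETIME)
RETURNS TABLE
AS
RETURN
    SELECT SaleDate, Amount
    FROM dbo.Sales
    WHERE SaleDate >= @from;
GO

-- Multi-statement TVF: the full result set is written into
-- @result before the caller can join to or filter it.
CREATE FUNCTION dbo.SalesSince_Multi (@from DATETIME)
RETURNS @result TABLE (SaleDate DATETIME, Amount MONEY)
AS
BEGIN
    INSERT INTO @result
    SELECT SaleDate, Amount
    FROM dbo.Sales
    WHERE SaleDate >= @from;
    RETURN;
END;
```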


In short, ask the person who advised you to be exceptionally specific about their advice. If it remains "never use functions", just ignore them.

MatBailie
  • Thank you for your feedback sir. It is much appreciated. This is what I suspected. The unfortunate part is I *have* to avoid them at all costs anyway (e.g. just throwing it in a stored proc) because the DBA threw the gauntlet down on me. It is nice to know that what I've learned up until this point isn't actually wrong, and that the DBA is just being 'particular' about how they want this done. – wakurth Aug 08 '12 at 15:38
  • @wakurth - Normal behaviour is to encapsulate the logic in a TVF and write SPs to select from them. The TVFs are maintenance tools, and the SPs are access control tools. – MatBailie Aug 08 '12 at 15:43
  • What you just described is EXACTLY what I did, and I was asked to change it by encapsulating the logic in an SP and then writing an SP that executes a select instead of the TVF... this is what drove me to ask this question on Stack. – wakurth Aug 08 '12 at 19:20

IMO the anti-practice is to use scalar functions in the WHERE clause of a query without any other filters that would give SQL Server good selectivity.

e.g.

   SELECT columns
   FROM [table]
   WHERE dbo.myFunc(col1) = 55

Will usually result in a table scan, irrespective of the indexing on col1.

As the others have pointed out, there are exceptions, e.g. it is possible to use deterministic, schema bound functions in an indexed computed column.

As an example, consider the following deterministic function:

CREATE FUNCTION dbo.myFunc(@id INT)
RETURNS INT
WITH SCHEMABINDING
AS
BEGIN
    RETURN (@id + 1)
END

Given the table (with the MSSQL default PK = Clustered Index)

CREATE TABLE MyTable
(
    ID INT Identity (1,1),
    SomeOtherColumn VARCHAR(50),

    CONSTRAINT PK_MyTable PRIMARY KEY(ID)
)

Populated with ~100k records
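One quick way to populate it, assuming you don't already have sample data (the cross join of a system catalog with itself is just a convenient row generator):

```sql
-- Generate ~100k rows; the identity column fills in ID automatically
INSERT INTO MyTable (SomeOtherColumn)
SELECT TOP (100000) 'Sample row'
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
```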

select * from MyTable where ID < 100 -- Index Seek :)

However, running the scalar function doesn't get the benefit of the Clustered Index

select * from MyTable where dbo.MyFunc(Id) < 100 -- Index Scan :(

Using the scalar func as a basis for a computed column

alter table MyTable add Computed as dbo.MyFunc(ID)

select * from MyTable where Computed < 100 -- Still Index Scan :(

However, because the Computed column is deterministic and schema bound, it can be indexed:

CREATE INDEX IX1_MyTable on MyTable(Computed) 

select * from MyTable where Computed < 100 -- Index Seek :)

Interestingly enough, applying the function now results in an index seek (SQL 2008R2)

select * from MyTable where dbo.MyFunc(ID) < 100 -- Index Seek :)
StuartLC