Function Efficiency

Question

I am a neophyte to creating stored procedures and functions and I just can't figure out why one of these versions runs so much faster than the other. This is a function that just returns a string with a description when called. The original function relies on supplying about 10 variables (Version running in about 4 seconds). I wanted to cut that down to a single variable (version running long).

The code below the declaration of the variables is identical, the only difference is that I'm attempting to pull the variables from the appropriate within the function itself rather than having to supply them on the query side.

i.e. dbo.cf_NoRateReason(V1) as ReasonCode rather than dbo.cf_NoRateReason(V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,V11,V12)

I apologize up front if I am not supplying enough information, as I said, new to functions/stored procedures.

This version runs in about 2.5 minutes to run

    declare @Agencyid int
    declare @ServiceCode varchar(10)
    declare @Mod1 varchar(2)=null
    declare @Mod2 varchar(2)=null
    declare @Mod3 varchar(2)=null
    declare @Mod4 varchar(2)=null
    declare @POS int
    declare @ServiceDate datetime
    declare @ProvType varchar(1)
    declare @PayerID int
    declare @BirthDate datetime
    declare @RenderingStaffID int 
    declare @SupervisingStaffID int 
    Select @Agencyid=s.agencyid, @ServiceCode = ServiceCode, 
           @Mod1 = ModifierCodeId, @Mod2 = ModifierCodeId2, 
           @Mod3 = ModifierCodeId3, @Mod4 = ModifierCodeId4,
           @POS=PlaceOfServiceId, @ServiceDate = ServiceDate, 
           @RenderingStaffId=isnull(dbo.GetProviderStaffId('S',s.ServiceTransactionId,'82'),0),
           @SupervisingStaffId=isnull(dbo.GetProviderStaffId('C',ClaimId,'DQ'),0),
           @ProvType=s.servicetype, @Payerid=pmt.payerid,
           @BirthDate=i.birthdate
      From ServiceTransaction s
            join individual i on s.servicetransactionid = i.individualid
            join pmtadjdetail pmt on s.servicetransactionid = pmt.servicetransactionid

    declare @Result Varchar(100) = ''
    declare @Age int = dbo.getageatservicedate(@birthdate, @ServiceDate)
    declare @ModString varchar(8) = dbo.sortmodifiers(@Mod1, @Mod2, @Mod3, @Mod4)
    declare @DirectSupervision int = (iif(@Mod1 in ('U1','U6','U7','U9','UA') 
        or @Mod2 in ('U1','U6','U7','U9','UA') 
        or @Mod3 in ('U1','U6','U7','U9','UA')
        or @Mod4 in ('U1','U6','U7','U9','UA'),1,0))

'************************************************************************************'
'This version takes about 4 seconds to run'
'************************************************************************************'
begin
    declare @Result Varchar(100) = ''
    declare @Age int = dbo.getageatservicedate(@birthdate, @ServiceDate)
    declare @RenderingStaffID int = dbo.getstaffid(@STID,'DQ')
    declare @SupervisingStaffID int = dbo.getstaffid(@STID,'82')
    declare @ModString varchar(8) = dbo.sortmodifiers(@Mod1, @Mod2, @Mod3, @Mod4)
    declare @DirectSupervision int = (iif(@Mod1 in ('U1','U6','U7','U9','UA') 
        or @Mod2 in ('U1','U6','U7','U9','UA') 
        or @Mod3 in ('U1','U6','U7','U9','UA')
        or @Mod4 in ('U1','U6','U7','U9','UA'),1,0))

The version you listed first **has no filter** (*no WHERE clause*) on the SELECT it uses to get the "parameter" values it is normally passed. You're effectively getting the entire join resultset, with the cost of the function calls for every result row, and only taking last result's values. — Uueerdo, Oct 17 '19 at 16:35
Scalar functions are horrible for performance in the best of situations. Better to use table valued parameters or inline table valued functions. — Sean Lange, Oct 17 '19 at 16:41
your first has a query in it starting 'Select @Agencyid=s.agencyid, @ServiceCode = ServiceCode, ' etc You might want to find out how long that takes — Cato, Oct 17 '19 at 16:43
Thank you Uueerdo, you are a genius too! that was exactly the problem — user194104, Oct 17 '19 at 16:54

Uueerdo · Answer 1 · 2019-10-17T17:02:44.047

This kind of falls under "typo" or simple oversight, but....

When you see that big of a performance difference, for no discernible reason (those functions were used in the original version as well), that is usually when you need to start look for these kinds of mistakes: typos, missing conditions, incorrect conditions from leaning too hard on intellisense/code-completion, etc...

When replacing multiple parameters with one that can used to retrieve the others automatically, always make sure to actually use that parameter.

The version you listed first has no filter (no WHERE clause) on the SELECT it uses to get the "parameter" values it is normally passed. You're effectively getting the entire join resultset, with the cost of the function calls for every result row, and only taking last result's values.

score 0 · Answer 2 · answered Oct 17 '19 at 16:31

0

You are correct - the only difference is using the function. Please see similar questions where this has been addressed.

In short, functions are going to be performed on a row-by-row basis whereas code on the query side is going to have other options with no overhead calls to the function. You may be able to use a scalar function with schema binding and nulls return nulls for better performance.

Additional consideration for the schema plan would be valuable. There are also joins and other embedded logics here that aren't clear without sample data.

answered Oct 17 '19 at 16:31

Sean Brookins

574
4
12

It seems more likely that his query starting 'Select @Agencyid=s.agencyid, @ServiceCode = ServiceCode, ' is the problem , especially as Uueerdo has pointed out, it may be returning many rows, but then only retaining values for the final row that happens to be returned (effectively any of them). – Cato Oct 17 '19 at 16:45
Thanks, I will read what you are suggesting. I am using a similar process to another scalar valued function within the database that runs *way* more quickly than mine and I just can't see a difference between what I'm doing and what the other function is doing in terms of setting variables which is where the hangup seems to be. – user194104 Oct 17 '19 at 16:50
check the comment from Uueerdo - unless there are tables with single rows, it seems that the query in the 1st proc is returning multi records - you surely need to narrow down to a single row - isolate the query in a SSMS window, change it to SELECT *, then see how many rows it gets – Cato Oct 17 '19 at 16:53
Cato, you are a genius and I am an idiot. You were 100% correct, I was not limiting the returned dataset to a single record (using the @STID variable). It now runs and runs in 1 second instead of 4! – user194104 Oct 17 '19 at 16:53
although experience told me to look at the query, it was Uueerdo who pointed out that it returned multi rows - so he is the genius - good luck with your project! – Cato Oct 17 '19 at 16:54
1

@Cato appears to be correct; I'll leave this up for the comment discussion since it's relevant. Sorry about the misunderstanding. – Sean Brookins Oct 17 '19 at 17:38

Function Efficiency

2 Answers2