17

So use case is as follows - there're some parameter, based on which I want to select data from one table or another.

create table dbo.TEST1 (id int primary key, name nvarchar(128))
create table dbo.TEST2 (id int primary key, name nvarchar(128))

So I've created function like this:

create function [dbo].[f_TEST]
(
    @test bit
)
returns table
as
return (
    select id, name from TEST1 where @test = 1

    union all

    select id, name from TEST2 where @test = 0
)

When I run it with constant, the execution plan is great - only one table is scanned

select * from dbo.f_TEST(1)

enter image description here

But, then, when I use variable, plan is not that good - both tables are scanned

declare @test bit = 1

select * from dbo.f_TEST(@test)

enter image description here

So are there any hints (or tricks) to force SQL Server to understand that in a certain query only one table should be scanned?

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
  • What's wrong with the accepted answer, so that you decided to open the bounty? What kind of extra details do you want to see in the answers? – Vladimir Baranov Feb 16 '18 at 05:27
  • Just wanted to know if there're some better techniques which can be used in SQL Server 2016 – Roman Pekar Feb 16 '18 at 11:42
  • As far as I know, `OPTION(RECOMPILE)` is the only hint that essentially replaces parameters with their actual values which allows optimizer to figure out that it doesn't have to touch a certain table. In all other cases when there are parameters involved optimizer has to generate a plan that would be valid for **any** combination of parameters. Such plan would be cached and reused in the future with different parameter values, which means that optimizer can't eliminate any of the tables from the plan. – Vladimir Baranov Feb 16 '18 at 13:51
  • Writing dynamic Sql is the only better way and remove UDF. – KumarHarsh Feb 19 '18 at 10:58

5 Answers5

13

If your function is inline-TVP(as in example) then you could use:

declare @test bit = 1
select * from dbo.f_TEST(@test) OPTION (RECOMPILE);

Then in both cases you will get single clustered index scan.

DBFiddle Demo

From Option RECOMPILE:

When compiling query plans, the RECOMPILE query hint uses the current values of any local variables in the query and, if the query is inside a stored procedure, the current values passed to any parameters.

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • 1
    nice one. I usually really against this option, but can consider it here. – Roman Pekar Aug 15 '17 at 19:22
  • 1
    The answer is good, but the bolded part is kinda wrong :) The problem here is not because the value was passed as parameter to function, the problem is using of variable itself, so the first part of that sentence is more related.. `"the current values of any local variables in the query"` – Nenad Zivkovic Aug 15 '17 at 19:34
  • the only thing I'd like to have is some hint which I could add to function itself so I don't have add recompilation hint to every query... – Roman Pekar Aug 15 '17 at 21:29
  • 1
    @RomanPekar, single-statement TVF is inlined, so you need to add the `RECOMPILE` hint to the main query. You could make it multi-statement TVF and include `RECOMPILE` hint in it, but it may make overall things worse, because it will not be inlined. – Vladimir Baranov Feb 16 '18 at 05:26
  • There is no reason to force recompiles for this. See answer below. – Pittsburgh DBA Feb 20 '18 at 13:18
4

It works fine as is. Take a look at "Number of Executions" on the relevant table, as you change your parameter value. The tables that will be excluded appear in the plan, because they must be considered, but that does not mean they will be scanned.

Number of Executions: 0

Also, look at the Startup Expression on the Filter:

enter image description here

Pittsburgh DBA
  • 6,672
  • 2
  • 39
  • 68
  • this is a great answer, but the problem is that the plan itself is still the same, which in my case doesn't work that well (for example, amount of rows can be very different for different parameter values) – Roman Pekar May 06 '19 at 14:21
2

You may wish to try

select top (@test*100) percent id, name from TEST1 

union all

select top ((1-@test)*100) percent id, name from TEST2
Serg
  • 22,285
  • 5
  • 21
  • 48
1

Use a stored procedure instead of table function, but beware of parameter sniffing. You can use dynamic SQL instead inside a stored procedure to produce the same result you seek using a table function.

This article will explain why what you are doing works the way that it does. https://learn.microsoft.com/en-us/sql/relational-databases/user-defined-functions/user-defined-functions

I'm sure that you want to do something more with the function, which is why you may not want to create a stored procedure instead. There is a way to use the results of a sporc execution in a query. But, that would be a different question than what you've logged here.

kalaolani
  • 323
  • 1
  • 16
-2

OPTION (RECOMPILE) won't help you here when your function is used against a table. This will query, for example, will scan both tables

-- 3rd table to test against
create table dbo.TEST3 (id int primary key, test bit);
insert dbo.TEST3 values(1,1),(2,1),(3,0),(4,1);
GO

select TEST3.* 
from TEST3 
CROSS APPLY dbo.f_TEST(test3.test) 
OPTION (RECOMPILE);

That's okay though. I'm short on time (otherwise I'd include a screenshot) but if you run these three queries with the actual execution plan on you'll see that the optimizer see's these as having the same cost:

DECLARE @test int = 1

select * from dbo.f_TEST(1)
select * from dbo.f_TEST(@test)
select * from dbo.f_TEST(@test) OPTION (RECOMPILE)

The second query will appear that it's twice as expensive as the 1st and last but, when you hover over the SELECT operator, you'll see that it's because the optimizer is estimating two rows instead of 1 (as is the case with the other two).

If you do some performance testing you'll see that, in this case, the optimizer is probably correct.

The bigger issue with your code is that a table scan is guaranteed for each table because you have no filter on either query. Adding a filter, if possible, will enable to you to index this both tables in a way where a seek happens instead of a scan.

Alan Burstein
  • 7,770
  • 1
  • 15
  • 18