0

Currently we have a Datawarehouse that is holding data from multiple tenants. SQL server is on version 2019. Same schema for all the tenant databases and the data from all the tenants is consolidated in the Datawarehouse. Data is partitioned in the datawarehouse on Tenant basis. We have parameter sniffing problem with the new dashboard as the data varies a lot between the tenants. Some tenants have data less than 10000 rows and a couple of tenants have data ranging up to 5 million rows. Due to this, dashboard performance is bad for large tenants if the execution plan is built based on a smaller tenant.

Suggestions on the internet are available asking to use Recompile hint or Optimize for hint etc. But I have a doubt on the basics of this parameter sniffing. As statistics are maintained by the SQL server at partition level, is this statistics information not used to see if the plan built is right for a new run time value? Before executing, are stats ever compared for the plans built on compile time and run time to see if they are valid and the associated plan is valid?

Kindly advise.

PraveenDS
  • 51
  • 3
  • It is only going to work if the partition id is statically available at compile time. A cross-partition query is never going to be able to use statistics specifically for one partition. Even a paramter specifying the partition id will not help – Charlieface Mar 16 '21 at 12:04
  • There is only one stats blob per index even if the table is partitioned. When you update stats at the partition level, those are merged into the global stats. For a data warehouse workload, I suggest you just add `OPTION(RECOMPILE)` and be done with it. – Dan Guzman Mar 16 '21 at 12:06
  • @DanGuzman A filtered index or filtered statistics could work, obviously the partition id needs to be statically known – Charlieface Mar 17 '21 at 00:32
  • @Charlieface - when you say make partition id as statically available at compile time, do you mean provide partition id as a parameter to the stored procedures? Would this not recur the problem of parameter sniffing with a different partition id? – PraveenDS Mar 17 '21 at 08:14
  • @DanGuzman - Following things are stopping me from using OPTION(RECOMPILE): 1. Since there are up to 500 tenants and broadly they divide into two (small and large) or three (small, medium and large) categories. So once a plan is available for that category of tenants, all the remaining tenants shall be able to reuse that plan. 2. We are going to have quite a number of reports and dashboards and I am little worried that this recompile hint could start to show impact as we grow once it is used at every place. – PraveenDS Mar 17 '21 at 08:19
  • Yes you can filter on `$Partition.MypartitionFunction(table.PartitionColumn) = 5` for example. But to avoid further issues, that number needs to be placed into the query directly (dynamic SQL), otherwise parameter sniffing can still happen. See https://learn.microsoft.com/en-us/sql/t-sql/functions/partition-transact-sql?view=sql-server-ver15 – Charlieface Mar 17 '21 at 10:44
  • @PraveenDS, not sure I understand what you mean about #1. The plan will not be reused because of the recompile query hint so each tenant will get the most optimal plan (theoretically) for their size. Recompile cost is typically milliseconds so, for a dashboard application like your question, the runtime savings can be quite significant. It should be easy to [race your horses](https://ericlippert.com/2012/12/17/performance-rant/) with a proof-of-concept. – Dan Guzman Mar 17 '21 at 10:53
  • Sure @DanGuzman. Thank you for the details. Regarding #1, I meant the ability to reuse a plan for a category of tenants without requiring recompiling within each category. If we have this option available or make it possible through any setting, it would have been wonderful. But yeah, understood that there is no such solution available currently that can be turned on and Recompile hint is the simple and easy solution to tackle this. – PraveenDS Mar 19 '21 at 07:19
  • @Charlieface - As much as possible, we are refraining from using Dynamic SQL in the code. Hence I may have to apply Recompile hint if passing the partitionid requires dynamic sql. – PraveenDS Mar 19 '21 at 07:22
  • Well you need some way of differentiating between the two plans. Perhaps one option is to write two queries, one that forces one plan and one the other. Then look up statistics on the partition you need, and depending on that choose the query to run – Charlieface Mar 19 '21 at 09:07
  • Yeah @Charlieface, but I need someway to differentiate the two plans with the same query based on the tenant size. As you said, with two procs or queries we can achieve this but it adds a lot of maintenance now and in the future as the same exercise has to be repeated for each and every report. I was hoping Statistics on Partitions can help without requiring two copies of the same query but probably not. Few options I have tried are: use DB snapshot for a category of tenants and use the actual DB for the second set. But that seems to be overkill on the system. – PraveenDS Mar 25 '21 at 08:22
  • Besides, I've read different set options can cause the compiler to take a different plan but I am not able to pass set values in connection string. Is there a way to Pass Set options like Ansi_nulls in connection string? or do you have any other suggestion to achieve this with a different connection string for each category of tenants? – PraveenDS Mar 25 '21 at 08:23
  • No, you have to run `SET` as part of the batch, any case that won't help you as you still have no way of differentiating the two plans (unless you force a plan guide, which I don't advise). As I said, you could embed the partition number in the query with dynamic SQL, while still passing through the rest of the parameters as normal. It's pretty safe as `$partition` is an `int`, so not really any risk. This way you only get a recompile per tenant, not per run of the query. You may find that not every report needs all this anyway. What is your client app coded in? – Charlieface Mar 25 '21 at 10:29

1 Answers1

0
  1. Embed the Partition number or the TenantID key in the query text

Parameters are for when you want shared, reused query plans. Hard-coding the criteria that cause query plans to vary is the basic right answer here.

And even though "As much as possible, we are refraining from using Dynamic SQL in the code", you should make an exception here.

  1. Use OPTION RECOMPILE

If you don't end up spending too much time in query optimization, this is almost as good. Or

  1. Add a comment into the query that varies by tenant or tenant size to get a partitioned plan cache. This is also useful for correlating queries to the code paths that generate them. eg
/* Dashboard: Sales Overview
   Visual: Total Sales
   TenantID: 12345    */   
select sum(Sales) TotalSales
from T
where TenantId = @tenantId
David Browne - Microsoft
  • 80,331
  • 6
  • 39
  • 67