3

I'm using the latest versions of SQL Server and EF Core in my .NET 7 project.

I have a table Deposits with 4 million rows. I encountered a small issue and I'd like to know how to address it.

I have a very simple query:

var deposits = await dataContext.Deposits
                                .Where(d => d.OwnerId == userId
                                            && d.DepositReason == DepositReason.Purchase)
                                .Select(p => p.Id)
                                .OrderByDescending(id => id)
                                .ToListAsync();

I noticed that it takes a very long time to execute (slightly over 10 seconds).

After some thinking, I swapped the conditions in the query:

Before:

d.OwnerId == userId
&& d.DepositReason == DepositReason.Purchase

After:

d.DepositReason == DepositReason.Purchase
&& d.OwnerId == userId

The query started executing in about 8 milliseconds (attached a screenshot).

Log

Okay, I built and deployed it to production, everything was fine, and the query was executing quickly. However, after a few hours, it seems that SQL Server changed the execution plan or something similar, and the query started taking around 11 seconds to execute again.

Moreover, through SSMS it has no difference if I execute

select * 
from deposits
where DepositReason = 2 
  and OwnerId = 1

or

select * 
from deposits
where OwnerId = 1 
  and DepositReason = 2

It's always fast.

I would really like to know how to deal with such an anomaly. Please, if anyone understands how this works and what can be done, share your knowledge. Thank you!

The slowest query:

SELECT [d].[Id]  FROM [Deposits] AS [d]  WHERE [d].[OwnerId] = @__userId_0 AND [d].[DepositReason] = CAST(2 AS tinyint)  ORDER BY [d].[Id] DESC

Execution plan

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Do you have index on those two fields? – siggemannen Aug 04 '23 at 17:10
  • @siggemannen sure I have indexes – Nikita Khromov Aug 04 '23 at 17:15
  • @the-impaler Please refer to the attached screenshots, that's why I said "anomaly" – Nikita Khromov Aug 04 '23 at 17:15
  • SQL Server reports the slowest queries -- execution plans included. Ask the DBA to retrieve it and add it to the question, so we can look at it. – The Impaler Aug 04 '23 at 17:17
  • @the-impaler done – Nikita Khromov Aug 04 '23 at 17:22
  • @the-impaler Index key columns OwnerId bigint 8 Included columns Amount int – Nikita Khromov Aug 04 '23 at 17:28
  • @the-impaler I'm the DBA that's why I'm here :) I guess I have to force some "good" execution plan or something like this. Just don't know where to start if I'm not wrong – Nikita Khromov Aug 04 '23 at 17:33
  • 2
    It's probably a parameter sniffing issue where some `OwnerId` have many more rows in `deposits` than others do. And if the plan gets compiled for one of the ones with few rows then it compiles a plan with the non covering index seek and lookups and if it gets compiled for a different `OwnerId` it does a scan and maybe sort as you have an `ORDER BY` here. You need to get the execution plan for both "slow" and "fast" cases. This is something you can get from Query Store – Martin Smith Aug 04 '23 at 17:54
  • 4
    SSMS queries with constant (literal) values are irrelevant, since EF Core produces parameterized queries (as it should), and SqlServer is known to suffer the so called "parameter sniffing" problem. As others mentioned, the order of predicates in `where` clause does not matter. Also, looks like you have no index which can cover both your predicates, so the one for FK always is used, regardless of the cardinality of the other predicate value. May be you should look and add some plan affecting hints. In any case, this is pure SqlServer issue and has nothing to do with EF Core. – Ivan Stoev Aug 04 '23 at 17:57
  • 1
    @IvanStoev yes, I'm sure it's an sql server issue, thank you, I will check the hints and what to do with them – Nikita Khromov Aug 04 '23 at 18:01
  • 1
    The problem with EF Core though would be how to let it add the hints to the query :) Here we could eventually help. At least adding `option (recompile)` which usually solves such issues. – Ivan Stoev Aug 04 '23 at 18:03
  • 2
    _As others mentioned, the order of predicates in where clause does not matter._ Well, not really. All other things being equal then it shouldn't matter at all. Except for the exceptions, like: changing even a single byte in a T-SQL statement (even whitespace) means that it hashes differently, which means that it gets a seperate plan in the execution plan cache, which means that SQL Server has the opportunity to generate a completely different plan for it based on the current statistics and cardinality estimates at the time. – AlwaysLearning Aug 04 '23 at 22:56
  • If you'd like to learn more about SQL Server's execution planning and caching mechanisms have a read through Erland Sommarskog's excellent article, [Slow in the Application, Fast in SSMS? Understanding Performance Mysteries](https://www.sommarskog.se/query-plan-mysteries.html) – AlwaysLearning Aug 04 '23 at 23:00
  • As mentioned above it looks like parameter sniffing and [solution](https://stackoverflow.com/a/73077746/10646316) with recompile. – Svyatoslav Danyliv Aug 05 '23 at 04:04
  • @SvyatoslavDanyliv Then the question is how frequently should I recompile it. It works fast for several hours, then slows down for another few hours. And so it goes in a circle. – Nikita Khromov Aug 05 '23 at 08:52
  • Actually always, you should execute this query with recompile option. Also please check update section of my answer. – Svyatoslav Danyliv Aug 05 '23 at 09:18
  • From other side, if `userId` is Guid - there can be issue with non monotonic unique identifiers. – Svyatoslav Danyliv Aug 05 '23 at 09:27
  • @SvyatoslavDanyliv long – Nikita Khromov Aug 05 '23 at 09:38

1 Answers1

0

Your issue is primarily due to a poor execution plan. The server is struggling to decide between using an index to lookup based on OwnerId and then do further key lookups to the clustered index, or to just scan the whole clustered index.

Parameter sniffing can exacerbate this issue, as that means that whichever value is queried first is the one being used to compile the query. You are sometimes getting a recompile, usually due to the query eventually dropping out of the plan cache.

RECOMPILE could fix that by recompiling every time, but it's really just a patch, and is not really fixing the root cause. It also has a significant CPU cost every time you run the query.

The real answer is to fix your indexing. For your query to perform well, you need to change your IX_Deposits_OwnerId index:

(OwnerId, DepositReason, ID DESC) INCLUDE (all_other_columns_here)

Unfortunately, because you are bringing back every column, you need to INCLUDE all those columns in your index. If you can limit the columns you bring back then you can add only those.

Charlieface
  • 52,284
  • 6
  • 19
  • 43