4

I have this query :

select top 100 id, email, amount from view_orders
    where email LIKE '%test%' order by created_at desc

It takes less than a second to run.

Now I want to parameterize it :

declare @m nvarchar(200)
set @m = '%test%'
SELECT TOP 100 id, email, amount FROM view_orders
    WHERE email LIKE @m ORDER BY created_at DESC

After 5 minutes, it's still running. With any other kind of test on parameters (if I replace the "like" with "="), it falls down to the first query level of performance.

I am using SQL Server 2008 R2.

I have tried with OPTION(RECOMPILE) , it drops to 6 seconds, but it's still much slower (the non-parameterized query is instantaneous). As it's a query that I expect will be run often, it's an issue.

The table's column is indexed, but the view is not, I don't know if it can make a difference.

The view joins 5 tables : one with 3,154,333 rows (users), one with 1,536,111 rows (orders), and 3 with a few dozen rows at most (order type, etc). The search is done on the "user" table (with 3M rows).

Hard-coded values :
Execution plan using hard-coded values

Parameters :
Execution plan using parameters

Update

I have run the queries using SET STATISTICS IO ON. Here are the result (sorry I don't know how to read that) :

Hard-coded values:

Table 'currency'. Scan count 1, logical reads 201.

Table 'order_status'. Scan count 0, logical reads 200.

Table 'payment'. Scan count 1, logical reads 100.

Table 'gift'. Scan count 202, logical reads 404.

Table 'order'. Scan count 95, logical reads 683.

Table 'user'. Scan count 1, logical reads 7956.

Parameters :

Table 'currency'. scan count 1, logical reads 201.

Table 'order_status'. scan count 1, logical reads 201.

Table 'payment'. scan count 1, logical reads 100.

Table 'gift'. scan count 202, logical reads 404.

Table 'user'. scan count 0, logical reads 4353067.

Table 'order'. scan count 1, logical reads 4357031.

Update 2

I have since seen a "force index usage" hint :

SELECT TOP 100 id, email, amount
FROM view_orders with (nolock, index=ix_email)
WHERE email LIKE @m
ORDER BY created_at DESC

Not sure it would work though, I don't work at this place anymore.

thomasb
  • 5,816
  • 10
  • 57
  • 92
  • You should specify what SQL database you are using (and what storage engine when there are multiple to choose from), because runtime-behavior and optimization capabilities depend on that. – Philipp Oct 25 '13 at 10:28
  • Thanks you're right, I have added it (I'm under SQL 2008 R2) – thomasb Oct 25 '13 at 10:30
  • Try with `OPTION (RECOMPILE)` http://www.sommarskog.se/dyn-search-2008.html – Tim Schmelter Oct 25 '13 at 10:32
  • 1
    Just so that you know, when you use an '=', wildcards ('%') won't work as a wildcard – Schalk Oct 25 '13 at 10:32
  • OPTION(RECOMPILE) drops the execution time to about 6 seconds. It's still much slower. @Schalk : I know, it's just to say that only the "like" operator has this performance issue. – thomasb Oct 25 '13 at 10:38
  • Yes, variables suck. Move along, nothing to see here. – SQLMason Oct 25 '13 at 14:44
  • 1
    You may get more insight into what's causing the different behaviour by running each query with SET STATISTICS IO ON to see which tables are being hit, how many times, for how many pages and how much read-ahead. – dav1dsm1th Oct 25 '13 at 15:23
  • Thanks, I have updated the question with this data (sorry, the messages are in french, but you should be able to guess what is said) – thomasb Oct 28 '13 at 09:01

5 Answers5

3

It could be a parameter sniffing problem. Better indexes or a full text search are the way to go but you might be able to get a workable compromise. Try:

SELECT TOP 100 A, B, C FROM myview WHERE A LIKE '%' + @a + '%'
OPTION (OPTIMIZE FOR (@a = 'testvalue'));

(like Sean Coetzee suggests, I wouldn't pass in the wildcard in the parameter)

MLT
  • 524
  • 7
  • 19
  • `OPTION RECOMPILE, OPTIMIZE FOR` gets the execution time to about 35 seconds, and just `OPTIMIZE FOR` gets it to 30 seconds. How do I get "better indexes" ? I will look into full text search. – thomasb Oct 25 '13 at 12:39
  • have you tried rebuilding statistics on the large tables/indexes? – MLT Oct 25 '13 at 12:55
  • I have rebuilt the indexes on the whole database (using `ALTER INDEX ALL ON ' + @TableName + ' REBUILD WITH (FILLFACTOR = 80)`). Weirdly, the execution is now taking much longer (it's now 1 minute and counting) – thomasb Oct 25 '13 at 13:22
  • Are there any other criteria you could add as where clauses to reduce the number of rows? – MLT Oct 25 '13 at 14:41
  • Not really. Plus, the problem arises only with the parameter, not when hard-coding the value. For the moment, I have modified the query generation to hard-code the value every time, it's ugly but it works. – thomasb Oct 25 '13 at 14:57
  • My only other suggestion is to try forcing it to use the 'IX_tbl_User_Mast_U_Cout' index with a query hint (be aware though that overruling the optimiser is usually a very bad idea - parameters and data change over time) – MLT Oct 25 '13 at 15:14
  • Just as a side point, what was your fillfactor before? If it was 100, rebuilding with fillfactor of 80 is likely to make it go slower as your pages are now 20% empty, so SQL server has to do more reads to get the same number of rows, plus you're using more disk space. – steoleary Oct 30 '13 at 11:59
0

You will definetly win when you add an index to the A column. Some time the index suggestion can be borrowed by SQL Server management studio. Paste you query and press Display Estimated Execution Plan button

Yaugen Vlasau
  • 2,148
  • 1
  • 17
  • 38
  • how does this explain why the first query runs fast and the second one does not? Besides what use is an index if you are comparing strings with `LIKE '%....'` – DrCopyPaste Oct 25 '13 at 10:32
  • I think that in first case the query is hardcoded in this case the server can make an optimization under the hood. – Yaugen Vlasau Oct 25 '13 at 10:34
  • it cannot it could only if the wildcard was not at the beginning of the search string http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx – DrCopyPaste Oct 25 '13 at 10:35
  • The column is already indexed (I have added that precision to the question). – thomasb Oct 25 '13 at 10:45
  • @DrCopyPaste: intersting article, but when I see at the execution plan of a sample I wrote I see that the indexes do used. One intersting detail. when I drop the index at A column the exeution plan with parametrized like clause seems complete the same as a plan with hardcoded like clause. In case of index I have completly different execution plan. in this plan I see Index Seek activity against the index. – Yaugen Vlasau Oct 25 '13 at 10:53
  • Interesting indeed, how can an `Index` get `Seek`ed when there is a wildcard at the beginning of the searched string? Seems awkward to me. – DrCopyPaste Oct 25 '13 at 11:01
  • @DrCopyPaste:what about the following command: SELECT * from Department where CHARINDEX(@mask, DeptName)<> 0. whould it take be faster than like command? – Yaugen Vlasau Oct 25 '13 at 11:04
  • http://stackoverflow.com/a/18066784/2186023 this seems to answer why there is an `Index Seek` here – DrCopyPaste Oct 25 '13 at 11:06
  • @YaugenVlasau I am not sure about your last query, if it should profit from an index that index should use the same function that you are comparing against, too; but I am unsure I have to say ;/ – DrCopyPaste Oct 25 '13 at 11:07
  • Using CHARINDEX, the query takes a bit more than 2 minutes to run. I have included the actual execution plans in the original question. – thomasb Oct 25 '13 at 12:54
  • @thomasbtv: when you do the execution plan, can you see any index suggestions Execution Plan window? look for text in green. if you do please apply the scripts into your db and check the effect – Yaugen Vlasau Oct 25 '13 at 13:02
  • No, there is no index suggestion. One thing I forgot : there is an `order by created_at` clause that, when removed in the parameterized query, gets the execution time down to zero (the execution plans I posted have this `order by`). – thomasb Oct 25 '13 at 13:27
  • @thomasbtv did you take a look at how the used index is ordered? – DrCopyPaste Oct 25 '13 at 13:29
  • How do I do that ? (I didn't even know that there was an order) – thomasb Oct 25 '13 at 13:31
  • @thomasbtv You can just get the Index definition from the SSMS by rightclicking on the index and get the creation script, by default ordering is ascending, but maybe you can see some `ASC` or `DESC` keywords after some columns mentioned in the index http://technet.microsoft.com/en-us/library/ms181154%28v=sql.105%29.aspx – DrCopyPaste Oct 25 '13 at 13:36
  • The index on the user's email is ASC. How would that change anything ? – thomasb Oct 25 '13 at 14:04
  • @thomasbtv in your query you order by `created_at DESC` but I suppose there is no ordered index on `created_`? order of other columns does not matter, what does matter are indexes on the column you are ordering by – DrCopyPaste Oct 25 '13 at 14:10
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/40003/discussion-between-thomasbtv-and-drcopypaste) – thomasb Oct 25 '13 at 14:28
0
CREATE INDEX index_name ON myview (A);
CREATE INDEX index_name ON myview (B);
CREATE INDEX index_name ON myview (C);

declare @a nvarchar(200)
set @a = '%testvalue%'
SELECT TOP 100 A, B, C FROM myview WHERE A LIKE @a
vhadalgi
  • 7,027
  • 6
  • 38
  • 67
0

What happens if you try:

set @a = 'test'
select top 100 A, B, C 
  from myview 
 where A like '%' + @a + '%'

I've tried a test on some dummy data and it looks like it may be faster.

Sean
  • 1,416
  • 19
  • 51
  • Actually that's what the query looked like before. It's exactly the same execution time as setting `@a = '%test%'`. The view joins 2 tables of several millions rows, and a 3 tables with a few dozen rows at most. – thomasb Oct 25 '13 at 10:51
0

The estimated execution plan for the parameterized version is clearly not right. I don't believe I've seen a query with 100% estimated cost twice! As the cost is supposed to total 100%. It's also interesting that it believes it needs to start with orders when you're clearly filtering by something on the user table.

I'd rebuild your statistics on all of the tables that are referenced in the view.

update statistics <tablename> with resample

Do one of these for each table involved.

You can attempt running the sql directly (copy paste view body into sql) both parameterized and not to see if it's the view sql is having issues with.

At the end of the day even when you get this fixed it's really only a stop gap. You have 3million users and every time you run the query sql has to go through all 3million records (the 75% scan in your top query) to find all the possible records. The more users you get the slower the query gets. Non-fulltext indexes can't be used for a like query with wildcards at the front.

In this case you can think about a sql index like a book index. Can you use a book index with "part" of a word to find anything quickly? Nope, you've got to scan the whole index to figure out all the possibilities.

You should really consider a full text index on your view.

greyalien007
  • 510
  • 4
  • 13
  • Thank you for your suggestions. However, how do you explain the difference between parameterized and non-parameterized ? – thomasb Oct 28 '13 at 08:29
  • You're judging performance by execution time. You need to make sure you also give a look at what the query costs. I would imagine update statistics should bring them inline. If not, it's just one of those things that makes the people writing the sql internal code smarter than me : ) – greyalien007 Oct 28 '13 at 14:43
  • I have updated statistics and it hasn't changed a thing. You suggest using full-text indexes, but the non-parameterized query doesn't seems to be needing them, so what do I gain by adding some, in addition to more maintenance ? I have chosen to use the non-parameterized method, which is exactly zero maintenance, low risk (for my use case), and high performance. **Edit** and yes, I'm judging performance by execution time, because what else matters to the end user ? – thomasb Oct 28 '13 at 15:19
  • As your user base increases and/or your record count increases the execution time will slow because a like %texttosearch% search is not cheap. Just for fun, create a full-text index and check the cost difference. I believe you'll also notice it's quicker. – greyalien007 Oct 28 '13 at 23:50
  • Yes, but right now, a '%search%' is cheap while a '%' + @search + '%' is not. This is the problem I have. The website is being remade in rails/mongodb so I don't want to introduce additional maintenance in our processes for a non-existant gain (because it works using a dirty method), for just a few months. But if it weren't the case, I would have used full-text indexes. Thanks ! – thomasb Oct 29 '13 at 08:23