How to optimize a select top N Query

Question

I have a very large table, consisting of 40 million rows, in a SQL Server 2008 Database.

CREATE TABLE [dbo].[myTable](
    [ID] [bigint] NOT NULL,
    [CONTRACT_NUMBER] [varchar](50) NULL,
    [CUSTOMER_NAME] [varchar](200) NULL,
    [INVOICE_NUMBER] [varchar](50) NULL,
    [AGENCY] [varchar](50) NULL,
    [AMOUNT] [varchar](50) NULL,
    [INVOICE_MONTH] [int] NULL,
    [INVOICE_YEAR] [int] NULL,
    [Unique_ID] [bigint] NULL,
    [bar_code] [varchar](50) NOT NULL,
 CONSTRAINT [PK_MyTable] PRIMARY KEY CLUSTERED 
(
    [ID] ASC,
    [bar_code] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

I am trying to optimize performance for the following query:

SELECT TOP 35
    ID,
    CONTRACT_NR,
    CUSTOMER_NAME,
    INVOICE_NUMBER,
    AMOUNT,
    AGENCY,
    CONTRACT_NUMBER,
    ISNULL([INVOICE_MONTH], 1) AS [INVOICE_MONTH],
    ISNULL([INVOICE_YEAR], 1) AS [INVOICE_YEAR],
    bar_code,
    Unique_ID
FROM MyTable
WHERE CONTRACT_NUMBER LIKE @CONTRACT_NUMBER
  AND INVOICE_NUMBER LIKE @INVOICE_NUMBER
  AND CUSTOMER_NAME LIKE @CUSTOMER_NAME
ORDER BY Unique_ID DESC

To do that, I built an index with included columns on CONTRACT_NUMBER, INVOICE_NUMBER and CUSTOMER_NAME.

CREATE NONCLUSTERED INDEX [ix_search_columns_without_uniqueid] ON [dbo].[MyTable] 
(
    [CONTRACT_NUMBER] ASC,
    [CUSTOMER_NAME] ASC,
    [INVOICE_NUMBER] ASC
)
INCLUDE ( [ID],
[AGENCY],
[AMOUNT],
[INVOICE_MONTH],
[INVOICE_YEAR],
[Unique_ID],
[Contract_nr],
[bar_code]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

The query still takes 3 to 10 seconds to execute. In the execution plan I see an index seek that accounts for about 30% of the total cost, followed by a Sort (Top N) operation that accounts for the other 70%. Any idea how I can optimize this query? A response time of less than 1 second is preferred.

Note: I also tried adding the column [Unique_ID] to the index key columns. In that case the execution plan does an index scan, but with many users querying the database I have the same problem.

Create an index on the column you want to sort by. The fastest option is a clustered index on that column; since that is not possible here, try a plain nonclustered index. – alzaimar, Oct 09 '13 at 09:29
In addition to creating indexes, replace the `LIKE` clauses with full-text search. – GriGrim, Oct 09 '13 at 09:32

score 1 · Answer 1 · answered Jul 04 '16 at 04:56

See the source linked at the end of this answer for more detail.

Update the statistics with a full scan to make the optimizer's job easier.

UPDATE STATISTICS tablename WITH FULLSCAN
GO

Turn on SET STATISTICS TIME and run the query between two snapshots of the tempdb I/O counters. If the counters do not change, that suggests the sort did not spill to tempdb.

SET STATISTICS TIME ON
GO
-- tempdb I/O counters before the query
SELECT num_of_reads, num_of_bytes_read, num_of_writes, num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), 1)
GO
SELECT TOP 100 c1, c2, c3
FROM yourtablename
WHERE c1 < 30000
ORDER BY c2
GO
-- tempdb I/O counters after the query
SELECT num_of_reads, num_of_bytes_read, num_of_writes, num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), 1)
GO

Result

CPU time = 124 ms,  elapsed time = 91 ms
Before Query execution 
num_of_reads         num_of_bytes_read    num_of_writes        num_of_bytes_written
-------------------- -------------------- -------------------- --------------------
725864               46824931328          793589               51814416384
After Query execution  
num_of_reads         num_of_bytes_read    num_of_writes        num_of_bytes_written
-------------------- -------------------- -------------------- --------------------
725864               46824931328          793589               51814416384

Source : https://www.mssqltips.com/sqlservertip/2053/trick-to-optimize-top-clause-in-sql-server/

davek · Answer 2 · 2013-10-09T10:12:27.510

0

Try replacing your clustered index (currently on two columns) with one solely on Unique_ID (assuming that it really is unique). This will help your sort. Then add a second, covering index, as you have tried, on the three columns used in the WHERE clause. Check that your statistics are up to date. I have a feeling that the column bar_code in your PK is preventing your sort from running as quickly as it could.
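
A rough sketch of that re-clustering, not a tested migration script. The index name CIX_MyTable_Unique_ID is made up for illustration, and the sketch assumes no foreign keys reference PK_MyTable. It uses a non-unique clustered index, so it does not depend on Unique_ID actually being unique:

```sql
-- Assumption: nothing references PK_MyTable through a foreign key.
-- Dropping a clustered primary key rebuilds the table as a heap, and every
-- nonclustered index is rebuilt along with it, so expect a long-running
-- operation on 40 million rows.
ALTER TABLE dbo.myTable DROP CONSTRAINT PK_MyTable;

-- Cluster on the sort column. The index is non-unique, so duplicate or NULL
-- Unique_ID values are allowed. (Hypothetical index name.)
CREATE CLUSTERED INDEX CIX_MyTable_Unique_ID
    ON dbo.myTable (Unique_ID DESC);

-- Keep enforcing the original key, now as a nonclustered primary key.
ALTER TABLE dbo.myTable
    ADD CONSTRAINT PK_MyTable PRIMARY KEY NONCLUSTERED (ID, bar_code);
```

With the rows stored in Unique_ID order, a TOP 35 ... ORDER BY Unique_ID DESC query can read in that order and stop once 35 rows pass the WHERE filter, instead of sorting every match. How soon it stops depends on how selective the filter is.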

Do your variables contain wildcards? If they do, and they are leading wildcards, the index on the WHERE columns cannot be used for a seek. If they contain no wildcards, try a direct "=" comparison, assuming case sensitivity is not an issue.
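
To illustrate the difference, here is a small sketch with a made-up literal ('ABC123' is hypothetical, not a value from the question), assuming an index whose leading key column is CONTRACT_NUMBER:

```sql
-- Prefix pattern: the optimizer can seek to the range of keys starting with 'ABC123'.
SELECT TOP 35 ID, CONTRACT_NUMBER, Unique_ID
FROM dbo.myTable
WHERE CONTRACT_NUMBER LIKE 'ABC123%'
ORDER BY Unique_ID DESC;

-- Leading wildcard: there is no key range to seek to, so every index row
-- has to be examined.
SELECT TOP 35 ID, CONTRACT_NUMBER, Unique_ID
FROM dbo.myTable
WHERE CONTRACT_NUMBER LIKE '%ABC123%'
ORDER BY Unique_ID DESC;

-- No wildcard at all: a plain equality seek.
SELECT TOP 35 ID, CONTRACT_NUMBER, Unique_ID
FROM dbo.myTable
WHERE CONTRACT_NUMBER = 'ABC123'
ORDER BY Unique_ID DESC;
```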

UPDATE: since you have leading wildcards, you will not be able to take advantage of an index on CONTRACT_NUMBER, INVOICE_NUMBER or CUSTOMER_NAME. As GriGrim suggested, the only alternative here is to use full-text search (the CONTAINS keyword etc.).
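
A minimal sketch of what the full-text route could look like. The names ftc_MyTable and UX_MyTable_ID and the search term are hypothetical. It also assumes that ID alone is unique, because a full-text index needs a unique, single-column, non-nullable key index and the existing primary key spans two columns. Note that CONTAINS matches whole words and word prefixes, not arbitrary substrings, so it is not a drop-in replacement for LIKE '%...%':

```sql
-- Assumption: ID is unique on its own; this statement fails if it is not.
CREATE UNIQUE NONCLUSTERED INDEX UX_MyTable_ID
    ON dbo.myTable (ID);

-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG ftc_MyTable;

-- Index the three searched columns. Population runs in the background,
-- so results are incomplete until it finishes.
CREATE FULLTEXT INDEX ON dbo.myTable
    (CONTRACT_NUMBER, INVOICE_NUMBER, CUSTOMER_NAME)
    KEY INDEX UX_MyTable_ID
    ON ftc_MyTable;
GO

-- Prefix search on one column: matches words starting with "smith".
DECLARE @CUSTOMER_TERM nvarchar(200) = N'"smith*"';

SELECT TOP 35 ID, CUSTOMER_NAME, INVOICE_NUMBER, CONTRACT_NUMBER, Unique_ID
FROM dbo.myTable
WHERE CONTAINS(CUSTOMER_NAME, @CUSTOMER_TERM)
ORDER BY Unique_ID DESC;
```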

edited Oct 09 '13 at 10:12

answered Oct 09 '13 at 09:32

davek


My variables contain leading wildcards. I cannot guarantee that column Unique_ID is really unique, and I prefer not to change the current table design. – dragomire zejlko Oct 09 '13 at 09:43
From the query execution plan, it seems that the query is using the index, as there is an index seek operation returning the required rows. As stated above, this index seek accounts for approximately 30% of the workload, with the rest consumed by the Sort (Top N) operation. – dragomire zejlko Oct 09 '13 at 12:11
