Why SELECT COUNT(*) performs clustered index scan?

Question

I have the following table:

CREATE TABLE [dbo].[Addr](
    [Address] [char](34) NOT NULL,
 CONSTRAINT [PK_Addr] PRIMARY KEY CLUSTERED 
(
    [Address] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

And I am trying to execute a query:

SELECT COUNT(*)
FROM Addr

When table contains about 8 million records, it was executing immediately. But now table contains 21 million records, and query is executing very slow. Managemet Studio shows the next estimated plan:

Execution plan

Screenshot from Storage tab (Table Properties):

enter image description here

I am using MSSQL 2008 Express 10.50.1617.0. Why this simple query has such complicated plan?

Probably revealing my own ignorance of SQL Server here, but what plan do you get for `select count(Address) from Addr`? (I'll be *shocked* if they're different, just curious.) — T.J. Crowder, Dec 12 '13 at 10:50
How slowly is "very slow"? You may also be encountering blocking from other transactions. Particularly if it never seems to finish at all that may well be the case. — Martin Smith, Dec 12 '13 at 11:01
"very slow" - I am waiting for 3 minutes, and after that terminate a query. There is not other transactions. — Zergatul, Dec 12 '13 at 11:03
3 minutes to read 10GB of data seems fairly slow but not impossible. Have you checked in `SELECT * FROM sys.dm_tran_locks WHERE resource_associated_entity_id = OBJECT_ID('Addr')` to be sure there are no conflicting locks? — Martin Smith, Dec 12 '13 at 11:10
I ran your query, there is no conflicting locks. Why you are saying about 10GB, data space in only 1 GB — Zergatul, Dec 12 '13 at 12:07
@Zergatul - Ah I see. I guess you are in a locale where `,` is the decimal separator not thousands! — Martin Smith, Dec 12 '13 at 13:14
In that case 3 minutes to read 1GB of data does seem painfully slow. BTW What is "Address"? Is it postal address? If so `[char](34)` seems a really odd choice. It will always store 34 characters even if the address is actually much shorter. You could possibly reduce the size of the table and IO significantly if you switch to `varchar`. — Martin Smith, Dec 12 '13 at 14:36
No, that is bitcoin address, that has constant (plus-minus 1) length. — Zergatul, Dec 12 '13 at 20:01

score 5 · Accepted Answer · answered Dec 12 '13 at 10:53

5

This plan is not complicated at all. To count the number of records, the engine has to scan entire table, but since there's a clustered index, it uses it. With no clustered index, thre would be Table Scan instead of Clustered Index Scan. If you had non-clustered index on any column, the optimizer would most probably choose that index to count the records, and the operation would be faster.

answered Dec 12 '13 at 10:53

AdamL

12,421
5
50
74

1

But how to explain, when table contained 8 million records, this query was executing immediately? – Zergatul Dec 12 '13 at 10:56
3

@Zergatul: Perhaps you've crossed the point where the index fits in memory. – T.J. Crowder Dec 12 '13 at 10:57
3

@Zergatul - Probably the pages were all in cache then. It still would have had a CI scan counting 8 million rows. – Martin Smith Dec 12 '13 at 10:58
That is interesting. Is there any ways to check this? – Zergatul Dec 12 '13 at 11:02
2

@Zergatul `SET STATISTICS IO ON` then run your query. I see you are on Express Edition so that is limited to 1GB of data cache and also doesn't use read ahead to bring pages into cache. Also you could check the level of fragmentation on the CI and defrag if needed. Might give some improvement. – Martin Smith Dec 12 '13 at 11:11
1

@Adam, why using non-clustered index for Count operation would be faster than clustered one? – GrzegorzM Aug 29 '18 at 08:38
You mean to tell me that the database doesn't just track the number of rows and actually has to inspect individual pages every time it wants a count? – Triynko Jan 11 '19 at 22:43

Why SELECT COUNT(*) performs clustered index scan?

1 Answers1