10
begin transaction;
create table person_id(person_id integer primary key);
insert into person_id values(1);
... snip ...
insert into person_id values(50000);
commit;

This code takes about 0.9 seconds on my machine and creates a db file that takes up 392K. Those numbers become 1.4 seconds and 864K if I change the second line to

create table person_id(person_id integer nonclustered primary key);

Why is this the case?

Edward
Elite Mx

3 Answers

5

A great answer to this question is available over at the DBA StackExchange: https://dba.stackexchange.com/questions/7741/when-should-a-primary-key-be-declared-non-clustered/7744#7744

Ryan Kirkman
2

Clustering the primary key stores it with the rows, which takes up less space because there are no separate index blocks. Its main benefit, however, is typically that range scans can access rows that sit in the same block, reducing IO operations, which becomes rather important once you have a large data set (not 50k ints).
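As a rough illustration of the access pattern this helps, here is a sketch using the question's table (the BETWEEN range is arbitrary):

select person_id from person_id where person_id between 10000 and 20000;

With the clustered layout the matching rows sit in contiguous pages, so the scan reads a few blocks sequentially; with a nonclustered index each match may need an extra lookup back into the table.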

I think 50k ints is a rather artificial benchmark and not one you care about in the real world.

MarkR
  • If I didn't plan on doing joins, nor range scans and only cared about insert performance - would there be any better way to create the table than the first examples? – Elite Mx Jan 26 '10 at 09:56
  • If you only cared about insert performance, you should use no indexes at all (if supported), or write the data into a text file. Appending to text files is pretty quick. – MarkR Jan 26 '10 at 21:49
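Taking that comment literally, a minimal sketch of the insert-only benchmark with no index at all (assuming the engine accepts a table without a primary key) would be:

begin transaction;
create table person_id(person_id integer);
insert into person_id values(1);
... snip ...
insert into person_id values(50000);
commit;

With no primary key there is no index to maintain on each insert, so this should be at least as fast as the clustered version.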
0

[Only as an idea]

Maybe when you explicitly tell it to use your integer column as the clustered key, it does just that. But when you tell it not to, it still has to build an index behind the scenes, perhaps with a wider key type, say twice as large. Each of those index entries then also has to reference a record in the table, and that is where the size explodes.
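If that guess were right, the nonclustered variant would behave, storage-wise, roughly like a heap table plus a separate unique index, along these lines (a sketch only; the index name ix_person_id is made up, and this is not necessarily what the engine does internally):

create table person_id(person_id integer not null);
create unique index ix_person_id on person_id(person_id);

Every entry in ix_person_id would then carry both the key value and a pointer back to its row, which is where the extra space would come from.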