20

I wonder why many websites choose to use random ids instead of incrementing from 1 in their database tables. I've searched without finding any good reasons. Are there any?

Also, which is the best method to use? It seems quite inefficient to check whether an id already exists before inserting the data, since that takes a second query.

Thanks for your help!

  • Just to avoid any security threats that are based on incremental value sets. – swapnesh Jun 20 '12 at 12:38
  • 1
    Incrementing ids leaks information about transaction rates if these are exposed to the client. e.g. number of new users that register each day. – Martin Smith Jun 20 '12 at 12:38

3 Answers

11

Under the hood, it is likely that they are using incremental ids in the database to identify rows, but the value that gets exposed to end users via the URL parameters is often made into a random string to make the sequence of available objects harder to guess.

It is really a matter of security through obscurity. It hinders automated scripts from proceeding through incremental values and attempting attacks via the URL, and it hinders automated scraping of site content.

If YouTube, for example, used incremental ids instead of values like v=HSsdaX4s, you could download every video by simply starting at v=1 and incrementing that value millions of times.
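To illustrate the idea of keeping an incremental id internally while exposing a random one, here is a minimal sketch using SQLite. The table name `videos`, the column `public_id`, and both helper functions are hypothetical names chosen for this example; they are not how any particular site does it.

```python
import secrets
import sqlite3

# In-memory database, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE videos (
        id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal, never shown to users
        public_id TEXT NOT NULL UNIQUE,               -- random token used in URLs
        title     TEXT NOT NULL
    )
    """
)


def add_video(title: str) -> str:
    """Insert a row and return the random id that would appear in the URL."""
    # 8 random bytes encode to 11 URL-safe characters.
    public_id = secrets.token_urlsafe(8)
    conn.execute(
        "INSERT INTO videos (public_id, title) VALUES (?, ?)",
        (public_id, title),
    )
    conn.commit()
    return public_id


def get_video_title(public_id: str):
    """Look a video up by its public id; returns None if it does not exist."""
    row = conn.execute(
        "SELECT title FROM videos WHERE public_id = ?", (public_id,)
    ).fetchone()
    return row[0] if row else None
```

Other tables can still reference the integer `id` as a foreign key, while a visitor who tries `v=1`, `v=2`, and so on gets nothing back.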

Michael Berkowski
  • What's the point of incremental ids? You will use v=HSsdaX4s when getting the row from the table. So is there really a need for incremental ids? – lawls Sep 28 '13 at 06:59
  • 2
    @lawls At large scale, there may be computational savings from indexing integer fields, especially considering they are probably used as foreign key columns in many other related tables, not just the main table. Really though, incremental ids are just the default behavior of many web frameworks and ORMs, and the RDBMS natively generates auto-increment ids and returns them immediately after the `INSERT`. So at small scale you only really gain the convenience of not having to reconfigure your ORM. You still need to write the algorithm to generate your string ids either way. – Michael Berkowski Sep 28 '13 at 12:34
4

Sequential ids do not scale well (they become a synchronization bottleneck in distributed systems).

Also, you don't need to check whether a newly generated random id already exists. If the id space is large enough (a 128-bit UUID, for example), you can simply assume that it does not, because a collision is extremely unlikely.
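As a rough sketch of why the extra lookup is unnecessary, the snippet below estimates the collision probability with the standard birthday approximation and inserts a random 128-bit id directly, letting a primary-key constraint catch the rare duplicate. The `items` table and the function names are assumptions made up for this example.

```python
import math
import secrets
import sqlite3


def collision_probability(n_ids: int, bits: int) -> float:
    """Birthday approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n_ids * (n_ids - 1) / 2 ** (bits + 1))


# In-memory database, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")


def insert_with_random_id(payload: str, max_attempts: int = 3) -> str:
    """Insert without a prior SELECT; retry only if the database reports a duplicate."""
    for _ in range(max_attempts):
        new_id = secrets.token_hex(16)  # 16 random bytes = 128 bits, 32 hex characters
        try:
            conn.execute(
                "INSERT INTO items (id, payload) VALUES (?, ?)", (new_id, payload)
            )
            conn.commit()
            return new_id
        except sqlite3.IntegrityError:
            continue  # duplicate id: generate a new one and try again
    raise RuntimeError("could not generate a unique id")
```

With a billion 128-bit ids the estimated probability of any collision is far below one in a trillion, whereas 100,000 ids drawn from a 32-bit space are more likely than not to collide, so the size of the id space matters.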

Thilo
1

Are you sure that the ids are random, or are they encoded? Either way, it is for security.

John Kane
  • I don't know if they are random, but Stack Overflow is a perfect example. There is probably no question with an id of 1; all questions have ids of 8 digits, as far as I've seen. –  Jun 20 '12 at 12:42
  • @piers - [How about this one](http://stackoverflow.com/questions/4/when-setting-a-forms-opacity-should-i-use-a-decimal-or-double) or [this one](http://stackoverflow.com/questions/6/why-doesnt-the-percentage-width-child-in-absolutely-positioned-parent-work-in-i) Just sort the questions by date then go to the last page. – Martin Smith Jun 20 '12 at 12:44
  • @piers User ids are incremented from 1 as well. [The founders](http://stackoverflow.com/users/1/jeff-atwood) hold [the early numbers](http://stackoverflow.com/users/4/joel-spolsky). – Michael Berkowski Jun 20 '12 at 12:48