10

We use Postgres and prisma for our Next.js app. Previous developers have used cuid for every table on our schema. For some reasons we are restructuring the tables and I was wondering would it be better to use int ids? Would it result in any performance gain?

What are the tradeoffs between using Int autoincrement id vs cuid for Postgres prisma client?

If you start comparing GUID vs Int ids for Postgres, please quote authentic reference proving that cuid is mapped to guid for Postgres.

Jason Aller
  • 3,541
  • 28
  • 38
  • 38
Mayank Kumar Chaudhari
  • 16,027
  • 10
  • 55
  • 122

2 Answers2

11

A sequence generating bigint values will certainly be faster than even the most efficient CUID or GUID algorithm, and the result will need less storage space.

The only good reasons to use something else like a CUID or GUID are

  • you have cryptographic requirements to obscure the creation order (but CUID doesn't do that)

  • you need to generate primary keys outside the database and in a distributed environment

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • CUID should be composed of timestamp+entropy+sequence number, is it not enough to obscure creation order? is there any attack which can exploit through CUID – l2ysho Jan 12 '23 at 13:14
  • 1
    @l2ysho You could look at the timestamp part, that would show the creation order. – Laurenz Albe Jan 12 '23 at 13:22
  • 1
    NOTE: `cuid@2` (https://github.com/paralleldrive/cuid2/#improvements-over-cuid) addresses quite a few flaws. – Dr1Ku Feb 23 '23 at 13:40
  • 2
    What about not wanting to leak competitive intelligence? e.g. number of registered users, etc. – Olivier Lalonde Feb 28 '23 at 01:12
  • @OlivierLalonde Leak to whom? The end user doesn't see artificially generated keys. And anyone who breaks into the database (e.g., with SQL injection) can query that information anyway. Finally, you could define your sequence as `INCREMENT BY 7` to obscure the count. – Laurenz Albe Feb 28 '23 at 06:44
  • @LaurenzAlbe to anyone who uses your app or api, through the ids shown in urls or api responses. PlanetScale has a good article on this (https://planetscale.com/blog/why-we-chose-nanoids-for-planetscales-api). They use regular auto increment ids internally, but also have public ids using NanoIDs for any records that are exposed publicly. – Roman Scher May 06 '23 at 15:07
  • @RomanScher Nobody forces you to expose the artificial primary key through an API. – Laurenz Albe May 07 '23 at 19:13
1

The most important reasons for which CUID may be used, are 1. Obscure the creation order, and 2. to generate primary keys outside the database and in a distributed environment.

Unfortunately, if one is using CUID for security reasons, you end up exposing even more info. As per cuid2,

The original Cuid leaked details about the id, including very limited data from the host environment (via the host fingerprint), and the exact time that the id was created.

Cuid2 solves all these limitations.

Also, as per the discussion here, autoincrement() has substantial performance benefits over using string IDs.

Conclusion

Use cuid2 if you want to support horizontal scaling and/or too concerned about attackers being able to guess your IDs. It produces secure, collision-resistant IDs optimized for horizontal scaling and performance and also obscures the creation order.

But, cuid2 as the official document states, “But not too fast: If you can hash too quickly you can launch parallel attacks to find duplicates or break entropy-hiding. For unique IDs, the fastest runner loses the security race.”.

Thus, if you are not concerned about attackers being able to guess the next/previous IDs, autoincrement() may be the best choice.

Mayank Kumar Chaudhari
  • 16,027
  • 10
  • 55
  • 122