
An ObjectId is stored in 12 bytes and does not even work well as a shard key, so I wonder whether it would be better to use a fully random int64 (8 bytes) for _id instead.

My idea: generate a fully random int64, check that it is not already present in the collection (it almost never will be if the pseudo-random generator works well), and if it is not, create the document with this _id. That way the _id uses only 8 bytes and works well as a shard key.

What do you think about it?
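The generate-check-insert loop described above can be sketched as follows. This is only an illustration under stated assumptions: a plain `dict` stands in for the collection, and the names `random_int64` and `insert_with_random_id` are hypothetical. Against a real MongoDB collection, the unique index on `_id` already rejects duplicates, so inserting and retrying on a duplicate-key error would probably be safer than a separate existence check, which can race with concurrent writers.

```python
import secrets

INT64_MIN = -(2**63)  # BSON int64 values are signed


def random_int64() -> int:
    """Return a uniformly random signed 64-bit integer."""
    # randbits(64) is in [0, 2**64 - 1]; shift it into the signed range.
    return secrets.randbits(64) + INT64_MIN


def insert_with_random_id(collection: dict, doc: dict, max_attempts: int = 5) -> int:
    """Store doc under a fresh random int64 _id and return that _id.

    `collection` is a dict used as a stand-in for a real collection
    (an assumption made only to keep the sketch self-contained).
    """
    for _ in range(max_attempts):
        candidate = random_int64()
        if candidate not in collection:  # the existence check from the question
            collection[candidate] = {**doc, "_id": candidate}
            return candidate
    raise RuntimeError("could not find a free _id after several attempts")


if __name__ == "__main__":
    store = {}
    new_id = insert_with_random_id(store, {"name": "example"})
    print(new_id, store[new_id])
```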

zeus
  • In my opinion this is a design decision, and I am not sure why you would choose ObjectId to shard on. The point of sharding is to spread the load, so think about which field would make sense for spreading the load in your design. – Ely Oct 23 '16 at 19:09

2 Answers


I would strongly recommend that you never use a randomly generated number as a unique ID. For one thing, you will always have to check for its existence when you insert a new record, because a random number is never guaranteed to be unique. Another fairly obvious reason is that int64 has a finite range.

Use ObjectId for your _id, since that is what it is for, or, if you have a very good reason not to, use GUIDs.
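To put a rough number on how often such a check would actually find a duplicate, the standard birthday approximation can be evaluated for a 64-bit ID space. This is a back-of-the-envelope sketch, not part of the original answer: it assumes the IDs are uniformly random and independent, and the function name `collision_probability` is hypothetical.

```python
import math

ID_SPACE = 2**64  # number of distinct int64 values


def collision_probability(n_docs: int, space: int = ID_SPACE) -> float:
    """Approximate chance that at least two of n_docs random IDs coincide.

    Uses the birthday approximation p ~ 1 - exp(-n(n-1) / (2N)),
    which assumes uniformly random, independent IDs.
    """
    exponent = -n_docs * (n_docs - 1) / (2 * space)
    # -expm1(x) equals 1 - exp(x) but stays accurate for tiny probabilities.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**10):
        print(f"{n:>14,} documents: p = {collision_probability(n):.3e}")
```

The probability is small for modest collections but grows roughly with the square of the document count, so uniqueness still has to be enforced rather than assumed.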

See here:

Considerations when selecting a shard key

pieperu
  • But ObjectId can't be used as a shard key, so this means I will need to keep another index in memory (maybe the hash of the ObjectId). – zeus Oct 23 '16 at 19:01
  • You can stamp another field with a randomly generated number and use that as your shard key if you really want, or include non-numeric characters and devise a more reliable way of generating unique keys. A shard key does not have to be a single field; you could use a compound shard key. I would not use a random number for your _id. I have updated my answer to include a link for you on shard keys. – pieperu Oct 23 '16 at 19:05
  • @loki It _can_ be used as a shard key, it is only a very bad one. – Markus W Mahlberg Oct 23 '16 at 21:48
  • @pieperu ObjectId is used for _id by default, but anything unique is sufficient, and the right choice depends heavily on the use case. In some cases even email addresses are a valid choice. Shard keys, however, do not need to be unique. Quite the contrary. Take economic regions, for example (EMEA, APAC, NCSA). If you want your shards close to your customers, it may well be that those three values make an excellent shard key. – Markus W Mahlberg Oct 23 '16 at 21:50
  • I agree. My point is that a random number is not unique, and making it so requires some effort to check that the number is not already an existing _id. As I stated above, a random alphanumeric string is better than a random number if it has to be unique. Yes, I also agree that a shard key does not need to be unique, quite the contrary. That is why I shared this link in my answer. – pieperu Oct 23 '16 at 21:53

An excellent solution is to take an ObjectId() and move its first 4 bytes to the end. The new ObjectId is still unique, and it can be used as a shard key because it is no longer a linearly increasing value.
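A minimal sketch of that byte rotation, working on raw 12-byte values with only the standard library. The helper names are hypothetical, and `make_objectid_like` is merely a stand-in that mimics the layout (a 4-byte big-endian timestamp followed by 8 other bytes). With PyMongo, the real bytes would presumably come from `ObjectId().binary`, and the result could be wrapped again with `ObjectId(rotated)`. How evenly the rotated values spread across shards depends on what the remaining 8 bytes contain.

```python
import os
import struct
import time


def make_objectid_like() -> bytes:
    """Stand-in for an ObjectId: 4-byte timestamp + 8 other bytes (assumed layout)."""
    return struct.pack(">I", int(time.time())) + os.urandom(8)


def rotate_timestamp_to_end(oid: bytes) -> bytes:
    """Move the leading 4 timestamp bytes to the end of a 12-byte ID."""
    if len(oid) != 12:
        raise ValueError("expected exactly 12 bytes")
    return oid[4:] + oid[:4]


def restore_original(rotated: bytes) -> bytes:
    """Inverse of rotate_timestamp_to_end, so uniqueness is preserved."""
    if len(rotated) != 12:
        raise ValueError("expected exactly 12 bytes")
    return rotated[-4:] + rotated[:-4]


if __name__ == "__main__":
    original = make_objectid_like()
    rotated = rotate_timestamp_to_end(original)
    print(original.hex(), "->", rotated.hex())
```

Because the rotation is reversible, two different inputs can never produce the same output, which is why the result stays as unique as the original value.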

JJussi