
I am new to MongoDB and Stack Overflow.

Why is the ID in a MongoDB collection 24 hex characters long? What is the significance of that?

ashish bandiwar
  • 1
    The official documentation is a good place to start: [ObjectId](http://docs.mongodb.org/manual/reference/object-id/) – Neil Lunn Aug 18 '14 at 04:14
  • 5
    The default unique identifier generated for a primary key (`_id`) is an [ObjectId](http://docs.mongodb.org/manual/reference/object-id/). This is a 12-byte binary value which is often represented as a 24 character hex string. If you have a more suitable unique identifier to use, you can provide your own value for `_id`. The importance of an ObjectId is that unique values can be generated in a distributed system (typically by the client driver). This is similar to [GUIDs](http://en.wikipedia.org/wiki/Globally_unique_identifier), although more compact. – Stennie Aug 18 '14 at 04:16

2 Answers


Why is the default _id a 24 character hex string?

The default unique identifier generated as the primary key (_id) for a MongoDB document is an ObjectId. This is a 12 byte binary value which is often represented as a 24 character hex string, and one of the standard field types supported by the MongoDB BSON specification.

The 12 bytes of an ObjectId are constructed using:

  • a 4 byte value representing the seconds since the Unix epoch
  • a 3 byte machine identifier
  • a 2 byte process id
  • a 3 byte counter (starting with a random value)
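
To make the byte layout concrete, here is a minimal sketch in plain JavaScript (Node.js) that splits a 24 character hex string into the four fields listed above. `decodeLegacyObjectId` is a hypothetical helper written for illustration, not part of any MongoDB driver, and the sample value is arbitrary.

```javascript
// Split a 24-character hex ObjectId string into the 4/3/2/3 byte layout
// described above. Each byte is two hex characters.
// NOTE: decodeLegacyObjectId is a hypothetical helper, not a driver API.
function decodeLegacyObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  return {
    // bytes 0-3: seconds since the Unix epoch
    timestampSeconds: parseInt(hex.slice(0, 8), 16),
    // bytes 4-6: machine identifier (kept as hex)
    machineId: hex.slice(8, 14),
    // bytes 7-8: process id
    processId: parseInt(hex.slice(14, 18), 16),
    // bytes 9-11: counter
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

// Arbitrary sample value, chosen only to show the field boundaries.
console.log(decodeLegacyObjectId("53f1a2b40123456789abcdef"));
```

The 24 characters are simply the 12 bytes written out two hex digits at a time, which is why the string form is exactly twice as long as the binary value.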

What is the importance of an ObjectId?

ObjectIds (or similar identifiers generated according to a GUID formula) allow unique identifiers to be independently generated in a distributed system.

The ability to independently generate a unique ID becomes very important as you scale up to multiple application servers (or perhaps multiple database nodes in a sharded cluster). You do not want a central coordination bottleneck such as a sequence counter (e.g. as you might have for an auto-incrementing primary key), and you want to insert new documents without the risk that a new identifier turns out to be a duplicate.

An ObjectId is typically generated by your MongoDB client driver, but can also be generated on the MongoDB server if neither your client driver nor your application code has already added an _id field.

Do I have to use the default ObjectId?

No. If you have a more suitable unique identifier to use, you can always provide your own value for _id. This can either be a single value or a composite value using multiple fields.

The main constraints on _id values are that they have to be unique for a collection and you cannot update or remove the _id for an existing document.

Stennie
  • 1
    Is that 4 byte value unsigned? If it's not MongoDB will have to do an overhaul in about 22 years... – Kenny Worden Jul 13 '15 at 03:35
  • 1
    @KennyWorden ObjectIds currently use a signed 32-bit int (i.e. [unixtime](https://en.wikipedia.org/wiki/Unix_time)), so you're correct that the time component will roll over eventually (see also: [What will happen to ObjectIDs in year 2038?](https://groups.google.com/forum/#!topic/mongodb-user/LIu93QgwOaA)). Generated ObjectIds should continue to be unique (bytewise) for a while after rollover but certain assumptions (such as ordering by a monotonically increasing time prefix) would no longer hold true. I assume there will be a replacement ObjectId subtype introduced before then :). – Stennie Jul 13 '15 at 05:47
  • I believe the unixtime component was originally included for uniqueness and a rough ordering of generated ObjectIds, and not to embed a timestamp in default `_id`s (although certainly developers have made assumptions about the timestamp aspect since then). There have been several ObjectId variations already, as implemented by different legacy drivers (see the ["subtypes"](http://bsonspec.org/spec.html) in the BSON spec or as written up in [UUID Support in Robomongo](http://robomongo.org/articles/uuids.html)). – Stennie Jul 13 '15 at 05:51
  • 1
    SIGNED? Why? They don't need to track time before the epoch! – Kenny Worden Jul 13 '15 at 06:36
  • @KennyWorden See also: [unix time](https://en.wikipedia.org/wiki/Unix_time): "Unix time is a single signed integer number which increments every second". I presume that "seemed like a good idea at the time" to the Unix kernel devs (similar to the choice of 1-Jan-1970 as the unix epoch). The usage in ObjectId generation is following the established convention for convenience. The overall ObjectId wants to be uniquely generated, but as noted I don't think it necessarily has to embed a timestamp. You're also free to use your own unique identifiers for `_id` rather than the default ObjectId. – Stennie Jul 13 '15 at 06:47
  • Thanks for your answers! I'll definitely look into this some more. :) – Kenny Worden Jul 13 '15 at 06:49

The current MongoDB version is now 4.2. An ObjectId is still 12 bytes, but it now consists of 3 parts.

ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values are 12 bytes in length, consisting of:

  • a 4-byte timestamp value, representing the ObjectId’s creation, measured in seconds since the Unix epoch

  • a 5-byte random value

  • a 3-byte incrementing counter, initialized to a random value
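
As a rough illustration of how such a value could be assembled without any central coordination, here is a minimal Node.js sketch of the three-part layout. It is not the driver's actual implementation: `sketchObjectIdHex` is a hypothetical name, and choosing the 5-byte random value once per process is an assumption of this sketch.

```javascript
const crypto = require("crypto");

// Assumption of this sketch: the 5-byte random value is picked once per process.
const processRandom = crypto.randomBytes(5);
// 3-byte counter, initialized to a random value.
let counter = crypto.randomBytes(3).readUIntBE(0, 3);

// Hypothetical helper: builds a 12-byte value and returns it as 24 hex characters.
function sketchObjectIdHex(nowMs = Date.now()) {
  const buf = Buffer.alloc(12);
  // bytes 0-3: seconds since the Unix epoch, big-endian (wrapped to 32 bits)
  buf.writeUInt32BE(Math.floor(nowMs / 1000) % 0x100000000, 0);
  // bytes 4-8: the per-process random value
  processRandom.copy(buf, 4);
  // bytes 9-11: the incrementing counter, big-endian
  buf.writeUIntBE(counter, 9, 3);
  // wrap the counter so it always fits in 3 bytes
  counter = (counter + 1) % 0x1000000;
  return buf.toString("hex");
}

console.log(sketchObjectIdHex());
```

Two IDs created in the same second by the same process differ in the counter, while IDs from different processes are very likely to differ in the random part, so no shared sequence is needed.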

Create an ObjectId and get the timestamp from it:

> x = ObjectId()
ObjectId("5fdedb7c25ab1352eef88f60")
> x.getTimestamp()
ISODate("2020-12-20T05:05:00Z")
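
The same timestamp can be recovered outside the shell, because it is just the first 4 bytes (8 hex characters) of the ID. The following is a minimal sketch in plain JavaScript; `objectIdTimestamp` is a hypothetical helper for illustration, not a driver function.

```javascript
// Read the leading 4 bytes of an ObjectId hex string as seconds since the
// Unix epoch and convert them to a JavaScript Date.
// NOTE: objectIdTimestamp is a hypothetical helper, not a MongoDB API.
function objectIdTimestamp(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000); // Date expects milliseconds
}

// Using the ObjectId from the shell session above:
console.log(objectIdTimestamp("5fdedb7c25ab1352eef88f60").toISOString());
// prints "2020-12-20T05:05:00.000Z"
```

This matches the value returned by `getTimestamp()` in the shell session, which has one-second resolution.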

Reference

Read MongoDB official doc