Let me explain the problem. I use node-mongodb-native as my MongoDB driver, and every time I need to run a find query on the _id field I have to convert the value to an ObjectID, like this:

var ObjectID = require('mongodb').ObjectID;

db.collection.find({_id: new ObjectID('51b02413453078800a000001')}, 
       function (err, docs) {
           ...
       });

I don't want to cast to ObjectID for every request. The only solution I've found so far is to generate a custom ObjectID as a string, like this:

var CustomPKFactory = {
   createPk: function() {
    return new ObjectID().toString();
   }
};

var mongoClient = new MongoClient(new Server('localhost', 27017), {   
   pk: CustomPKFactory,
});

In this case _id is stored as a string, so I don't need to convert it to ObjectID. But I don't know how this will affect query performance.

Could you tell me what the advantages and disadvantages of this approach are?

BenMorel
Erik
  • A string will use more space in an index, and I imagine it would also be less performant. There aren't really many advantages, except that you can query without wrapping everything in `ObjectId` – Sammaye Jun 06 '13 at 07:13
  • Ok. Is it possible to avoid casting to ObjectId in each query? – Erik Jun 06 '13 at 07:50
  • Hmm, maybe you could create an extended `model` class whose find functions detect whether you are searching by `_id` and, if you are, cast it to ObjectId – Sammaye Jun 06 '13 at 07:57
  • I find that's not so handy, because I can also search by author or any other field that has the ObjectID type. – Erik Jun 06 '13 at 08:17
  • Then there is no really easy way, since only you know which fields should be an ObjectId. The only other way is to extend my previous comment with a list of fields, stored in the model, that will be translated into ObjectIds when used in a query, but to be honest that would be overkill – Sammaye Jun 06 '13 at 08:26
  • I thought that node-mongodb-native supported automatic conversion from a string id to ObjectId :( – Erik Jun 06 '13 at 08:28
  • I am unsure how it could, since it doesn't know the context of the string... computers are not that human yet, unless it just assumes every 24-character string is an ObjectId – Sammaye Jun 06 '13 at 08:38
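
The field-list idea from the comments above can be sketched as a small helper. This is only an illustration under stated assumptions: `castIdFields`, the field list and the `fakeObjectId` stand-in are hypothetical names, not part of node-mongodb-native, and the helper only handles top-level equality matches.

```javascript
// Hypothetical helper (not a driver API): casts the listed top-level fields
// to ObjectIds when their value is a 24-character hex string.
function castIdFields(query, idFields, toObjectId) {
  const out = {};
  for (const [key, value] of Object.entries(query)) {
    const looksLikeId =
      typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
    out[key] = idFields.includes(key) && looksLikeId ? toObjectId(value) : value;
  }
  return out;
}

// Stand-in converter so the sketch runs without the driver. With the driver
// you would pass something like: (hex) => new ObjectID(hex)
const fakeObjectId = (hex) => ({ $oid: hex });

const query = castIdFields(
  { _id: '51b02413453078800a000001', title: 'abcdefabcdefabcdefabcdef' },
  ['_id', 'author'], // assumption: only you know which fields hold ObjectIds
  fakeObjectId
);
console.log(query);
```

Only `_id` is converted here; `title` stays a string even though it happens to be 24 hex characters, which is the context problem Sammaye points out.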

1 Answer

By default the string will be bigger, as Sammaye described in the comments. To make that concrete:

Object.bsonsize({ "_id" : ObjectId("51b10b55f202d3fee925d637")}) = 22 
Object.bsonsize({ "_id" : "51b10b55f202d3fee925d637"}) = 39
Object.bsonsize({ "_id" : "aaaaaaa"}) = 22
Object.bsonsize({ "_id" : 9999999999999998 }) = 18

So a 7-character string has the same size as an ObjectId. A number is smaller, but you have to consider the following:
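
The four sizes above can be reproduced by hand from the BSON layout. The following is a rough sketch for a single-field document, not the shell's own implementation; `bsonSizeOfSingleField` and the other names are made up for illustration.

```javascript
// BSON document = int32 total length + elements + trailing NUL byte.
// Each element = type byte + NUL-terminated field name + value bytes.
function bsonSizeOfSingleField(name, valueBytes) {
  const header = 4;                                 // int32 document length
  const element = 1 + Buffer.byteLength(name) + 1;  // type byte + name + NUL
  const terminator = 1;                             // end-of-document NUL
  return header + element + valueBytes + terminator;
}

const objectIdBytes = 12; // ObjectId is 12 raw bytes
const doubleBytes = 8;    // 64-bit floating point
// A string value is an int32 length, the UTF-8 bytes, and a NUL terminator.
const stringBytes = (s) => 4 + Buffer.byteLength(s, 'utf8') + 1;

const sizes = [
  bsonSizeOfSingleField('_id', objectIdBytes),
  bsonSizeOfSingleField('_id', stringBytes('51b10b55f202d3fee925d637')),
  bsonSizeOfSingleField('_id', stringBytes('aaaaaaa')),
  bsonSizeOfSingleField('_id', doubleBytes),
];
console.log(sizes); // [ 22, 39, 22, 18 ]
```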

What I found really interesting is that the mongo shell types numbers automatically: a plain numeric literal is stored as a BSON Double. In my tests the largest 16-digit value that was stored exactly as typed was 9999999999999998, which looks odd at first because the limit has nothing to do with the decimal representation. A double represents every integer exactly only up to 2^53 (9007199254740992); above that, values are rounded to the nearest representable double. For example:

{_id:9999999999999999} 

is stored as 1e+16.0, which is a rounded value, so when you then try to insert:

insert({_id:10000000000000001})
E11000 duplicate key error index: $_id_  dup key: { : 1e+16.0 }

I am thinking about submitting a bug report.
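
The duplicate key above is consistent with ordinary IEEE 754 double rounding in JavaScript rather than anything specific to MongoDB. A minimal check in plain Node.js (no database involved; the variable names are only for illustration):

```javascript
// Integers are exactly representable as doubles only up to 2^53 - 1 "safely".
const maxSafe = Number.MAX_SAFE_INTEGER;               // 9007199254740991
const isSafe = Number.isSafeInteger(9999999999999998); // false: above 2^53

// Above 2^53 only even integers are representable, so odd literals are
// rounded to a neighbouring even value before any driver ever sees them.
const roundsUp = 9999999999999999 === 1e16;    // true
const roundsDown = 10000000000000001 === 1e16; // true

console.log(maxSafe, isSafe, roundsUp, roundsDown);
```

Both literals collapse to the same double, so from the server's point of view the second insert really does carry the same _id.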

The situation is even worse with the NumberLong() type, which is the 64-bit integer BSON type:

> db.m.insert({_id: NumberLong(10000000000000001)})
E11000 duplicate key error index: t.m.$_id_  dup key: { : 10000000000000000 }
> db.m.insert({_id: NumberLong(10000000000000002)})
> db.m.insert({_id: NumberLong(10000000000000003)})
> db.m.insert({_id: NumberLong(10000000000000004)})
E11000 duplicate key error index: t.m.$_id_  dup key: { : 10000000000000004 }
> db.m.insert({_id: NumberLong(10000000000000005)})
E11000 duplicate key error index: t.m.$_id_  dup key: { : 10000000000000004 }
> db.m.insert({_id: NumberLong(10000000000000006)})
> db.m.insert({_id: NumberLong(10000000000000007)})
> db.m.insert({_id: NumberLong(10000000000000008)})
E11000 duplicate key error index: t.m.$_id_  dup key: { : 10000000000000008 }
> db.m.insert({_id: NumberLong(10000000000000009)})
E11000 duplicate key error index: t.m.$_id_  dup key: { : 10000000000000008 }

So you can use numbers, which are smaller in storage size than an ObjectId, but be careful.
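
The NumberLong duplicates follow the same pattern: the argument is written as a numeric literal, so it is rounded to a double before NumberLong ever receives it. In the mongo shell, NumberLong also accepts a string, for example NumberLong("10000000000000001"), which avoids the intermediate double. The effect can be illustrated in plain Node.js with BigInt standing in for a 64-bit integer type (an analogy, not the shell's implementation):

```javascript
// The numeric literal is rounded to a double first, then converted.
const viaNumber = BigInt(10000000000000001);
// The string is parsed digit by digit, so nothing is lost.
const viaString = BigInt('10000000000000001');

console.log(viaNumber.toString()); // 10000000000000000
console.log(viaString.toString()); // 10000000000000001
```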

attish
  • It is a bit harder. I would build the id from a suitable Unix timestamp followed by a sequence number, sized for the load you need to serve. With several application clients, the unique id of the client also has to be included. With this you end up with the same logic MongoDB uses internally, so I am not sure it would be faster. – attish Jun 07 '13 at 13:06
  • I think this might have more to do with JS than with MongoDB itself – Sammaye Jun 07 '13 at 19:36