
Instead of:

{
   A: [user_id_1, user_id_2, etc.]
}

I want to create this schema:

{
   A: {
    user_id_1: true,
    user_id_2: true,
    etc...
  }
}
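The complexity argument behind this change can be sketched in plain Python (illustrative only — MongoDB's internal storage is BSON, not Python objects): membership in a list is a linear scan, while a dict key lookup is a hash-table probe.

```python
# Two in-memory shapes mirroring the two schemas above.
array_schema = {"A": ["user_id_1", "user_id_2", "user_id_3"]}
object_schema = {"A": {"user_id_1": True, "user_id_2": True, "user_id_3": True}}

# Array form: the membership test scans the list -> O(N).
found_in_array = "user_id_2" in array_schema["A"]

# Object form: the key lookup hits a hash table -> O(1) on average.
found_in_object = "user_id_2" in object_schema["A"]

print(found_in_array, found_in_object)
```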

The reason is that, in order to find whether user_id_x is `$in` A when A is an array, the time complexity is O(N).

However, as I understand it, the time complexity of looking up a key-value pair is either O(1) or O(log N).

If I choose the second schema design for MongoDB, will it have the performance improvements described above?

  • you can keep the first, and make a multikey index on A; it will be fast, no collection scan, only an index scan. – Takis Jan 18 '23 at 23:56
  • But will the obj work as expected? – Bear Bile Farming is Torture Jan 19 '23 at 00:04
  • you should be thinking of how to create an index to avoid a collection scan and reduce I/O. Searching inside the array once the data is in memory is a very small cost. Keeping data as fields is a bad idea, and with the second option, you can't create an index either. Try the first with a multikey index and you will be fine, I think – Takis Jan 19 '23 at 00:30
  • Indexes and collection scans are not relevant. The query assumes one document. And then queries whether a certain `user_id_x` exists in the field `A` of this one document. – Bear Bile Farming is Torture Jan 19 '23 at 02:09
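The two schemas are queried with different filter shapes. The snippet below builds the filter documents as plain Python dicts as a sketch (field name `A` is from the question; `user_id_2` is a made-up value); note that MongoDB treats a plain equality filter on an array field as a membership test.

```python
user_id = "user_id_2"

# Array schema: {"A": value} on an array field matches documents whose
# array contains the value (equivalent to {"A": {"$in": [value]}} for
# a single value).
array_filter = {"A": user_id}

# Object schema: test for the presence of the key via dot notation.
object_filter = {f"A.{user_id}": {"$exists": True}}

print(array_filter)
print(object_filter)
```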

1 Answer


Yes, if you choose the schema design where the field's value is an object, the time complexity of finding a specific user_id would be O(1) or O(log N), as you mentioned. This is because MongoDB uses a hash table data structure to store objects, which allows constant-time lookups by key. This is a significant improvement over the O(N) time complexity of searching through an array. However, this performance improvement will only be seen if the number of keys is relatively small.

Jahidul
  • Can you explain what you mean by `this performance improvement will only be seen if the number of keys is relatively small.`? Why will it not scale? – Bear Bile Farming is Torture Jan 21 '23 at 07:20
  • In a MongoDB collection, each document has a maximum size limit, which is typically 16MB. If the number of keys in the object becomes very large, it is possible that the size of the document could exceed this limit, which could cause performance issues. Additionally, when the number of keys becomes very large, the hash table that MongoDB uses to store the object could become less efficient, leading to longer lookup times. – Jahidul Jan 21 '23 at 07:35
  • Also, if the number of keys increases significantly, it could lead to larger memory usage and disk storage. This could cause performance to degrade when querying, indexing, or updating the documents, as the database needs to process more data. So, the performance improvement that you expect when using the object schema design will only be seen if the number of keys is relatively small and stays within an acceptable limit. – Jahidul Jan 21 '23 at 07:35
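The 16MB document limit mentioned above can be sanity-checked with a rough size estimate. The sketch below uses JSON length as a proxy for BSON size (these differ, so treat the number as an approximation only); the field name `A` and the user-id pattern are taken from the question.

```python
import json

MAX_BSON_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit (16 MB)

def approx_doc_size(doc):
    # JSON length is only a rough proxy for BSON size, but it shows
    # how the document grows linearly with the number of keys.
    return len(json.dumps(doc).encode("utf-8"))

# A document using the object schema with many keys.
doc = {"A": {f"user_id_{i}": True for i in range(100_000)}}
size = approx_doc_size(doc)
print(size, size < MAX_BSON_BYTES)
```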