
We have a multitenant application that uses Azure DocumentDB as our NoSQL document oriented database.

For multitenancy, we read this question and this blog post. Because our number of users does not yet justify separate databases and/or documentCollections and, more importantly, to save costs, we implemented multitenancy with a "Where" clause on a TenantId field within a single documentCollection.

Similarly, when it comes to storing "documents" or "objects" of completely different natures (say, Book and Car), we are wondering whether to use a single documentCollection.

At first, it looks more reasonable to create two different documentCollections for Book and Car. However, creating a documentCollection costs $25 minimum. We do not want to pay an extra $25 every time we add a new feature, even if it only stores a small amount of data (e.g. our app stores a lot of Books but few Cars...).

Is it good design to put Book and Car in the same documentCollection and keep a reference to the document type in a shared member (e.g. string Type = "Book" or string Type = "Car")?

Given that we have already implemented multitenancy with a "Where" clause, to query all Cars in our app for a given tenant, our queries would all contain Where TenantId = "XXXX" AND Type = "Car".
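As a rough illustration of the query shape described above, here is a minimal Python sketch. It assumes documents carry `TenantId` and `Type` fields as in the question. The helper names `build_query_spec` and `matches`, the `id` values, and the sample data are hypothetical. The dictionary layout follows the parameterized-query form (a `query` string plus `name`/`value` pairs) accepted by the SQL API, and the in-memory filter only mimics what the server-side `WHERE` clause would select.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def build_query_spec(tenant_id: str, doc_type: str) -> Dict[str, Any]:
    """Build a parameterized query for one tenant and one document type."""
    return {
        "query": (
            "SELECT * FROM c "
            "WHERE c.TenantId = @tenantId AND c.Type = @type"
        ),
        "parameters": [
            {"name": "@tenantId", "value": tenant_id},
            {"name": "@type", "value": doc_type},
        ],
    }


def matches(doc: Document, tenant_id: str, doc_type: str) -> bool:
    """In-memory equivalent of the WHERE clause, for illustration only."""
    return doc.get("TenantId") == tenant_id and doc.get("Type") == doc_type


# Hypothetical sample data: Books and Cars share one collection.
documents: List[Document] = [
    {"id": "1", "TenantId": "XXXX", "Type": "Book"},
    {"id": "2", "TenantId": "XXXX", "Type": "Car"},
    {"id": "3", "TenantId": "YYYY", "Type": "Car"},
]

# All Cars for tenant "XXXX"; cars of other tenants are excluded.
cars_for_tenant = [d for d in documents if matches(d, "XXXX", "Car")]
spec = build_query_spec("XXXX", "Car")
```

Passing the tenant and type as parameters rather than concatenating them into the query string keeps the tenant filter from being altered by user-supplied values.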

I have seen that DocumentDB now supports Partitioned Collections. Would this be a good use of partitions, or should they instead be reserved for achieving better scalability, making them unsuited to segregating document types whose object counts may differ widely?

Benoit Patra
  • Possible duplicate of [Single or Multiple Entities Per Collection in DocumentDB](https://stackoverflow.com/questions/27456564/single-or-multiple-entities-per-collection-in-documentdb) – Michael Freidgeim Mar 14 '18 at 12:58

1 Answer


Yes, it is "good design" to use type="Book". You can also do isBook=true, which I believe is slightly more efficient and enables inheritance and mixin behavior.

Partitioned Collections are actually a way to put more stuff into a single larger entity rather than the other way around. The idea is to allow scaling of both throughput (RUs) and space without the burden of managing multiple Collections yourself. You "could" make your partition key be your type field, but I would not recommend it. Partition keys should enable roughly even spread among partitions... among other criteria.
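To make the "roughly even spread" point concrete, here is a small Python sketch under stated assumptions. The workload (four tenants, each with 245 Books and 5 Cars) and the helper `largest_partition_share` are hypothetical, and the sketch only counts documents per partition key value. It says nothing about throughput, and it compares against `TenantId` purely for illustration, not as a recommended key.

```python
from collections import Counter
from typing import Any, Dict, List

Document = Dict[str, Any]


def largest_partition_share(docs: List[Document], key: str) -> float:
    """Fraction of all documents that land in the most populated key value."""
    counts = Counter(doc[key] for doc in docs)
    return max(counts.values()) / sum(counts.values())


# Hypothetical workload: many Books, few Cars, spread over four tenants.
docs: List[Document] = [
    {"TenantId": f"tenant-{t}", "Type": doc_type}
    for t in range(4)
    for doc_type, count in (("Book", 245), ("Car", 5))
    for _ in range(count)
]

share_by_type = largest_partition_share(docs, "Type")
share_by_tenant = largest_partition_share(docs, "TenantId")
print(f"largest share by Type:     {share_by_type:.0%}")
print(f"largest share by TenantId: {share_by_tenant:.0%}")
```

With this made-up data, 980 of the 1,000 documents share the value `Type = "Book"`, so a type-based key concentrates almost everything under one value, which is the uneven spread the answer warns about.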

Larry Maccherone
  • Thank you for your prompt answer. Just to be sure: when you write "Partition keys should enable roughly even spread among partitions... among other criteria," do you mean that partition keys are used to create partitions of roughly similar size? – Benoit Patra Apr 20 '16 at 17:16
  • We are also investigating your [lumenize](https://github.com/lmaccherone/documentdb-lumenize) library to enable aggregate functions in DocumentDB. It looks very promising. – Benoit Patra Apr 20 '16 at 17:18
  • Similar size and throughput. Let me know if you need help with Lumenize. I monitor that Stack Overflow tag as well as this one. – Larry Maccherone Apr 20 '16 at 18:55
  • Can you please expand on the statement "`isBook=true`, which I believe is slightly more efficient and enables inheritance and mixin behavior"? – Jacob Apr 18 '18 at 16:14
  • Let's say you have a WorkItem class with subclasses for Defect and Story. When you are rendering a page for the sprint with all the WorkItems, you can search for isWorkItem=true. When you are doing things that only apply to Defects, you can look for isDefect=true. If you just had type="Defect" or type="Story", that could also work. You'd just have to use an `or` clause for Defect or Story. But what about when you add another WorkItem type? You then have to change the code everywhere you wanted all WorkItems. – Larry Maccherone Apr 19 '18 at 23:20
  • So assume we have an extra property to check the document type, and our predicate will be .where(x=>x.tenantId == '' && x.type == 'book'). Is it okay to keep the type property in the 'Book' object? (Only then can we use x.type.) – Ranadheer Reddy Oct 04 '18 at 07:16
  • My personal preference would be to do one or the other, but sure. The way you are doing it (`x.type == 'book'`) is a lot more common than `isBook == true`. That's just my preference, to enable mixin and inheritance behavior. – Larry Maccherone Oct 05 '18 at 12:39
  • @LarryMaccherone (sorry for a bit late comment) how about having different `type` property (Defect, Story, whatever..) but have all `WorkItems` under same partitionKey? That way getting all work items is a matter of reading all documents with same partition key and filtering specific types is filtering by `type` property. Of course, partitionKey for `WorkItems` should probably be granular/composite one, including whatever parent makes sense in that context; be it `CreatorId`, `Year`, `ProjectId` or similar. – dee zg Feb 05 '19 at 18:53
  • @deezg, that works for two levels of inheritance. What if you needed 3? – Larry Maccherone Feb 06 '19 at 20:40
  • @LarryMaccherone I believe something like `ARRAY_CONTAINS(["story", "workitem", "defect"], items.type)` would cover those cases, although I have a hard time imagining why I would need my db documents to reflect domain model inheritance 1:1. Please don't get me wrong, I'm not saying anything is wrong with your approach... it just got my attention as something I've never used, and I'm trying to check whether I have a use case for it. – dee zg Feb 07 '19 at 04:21
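
The flag-versus-type trade-off discussed in the comments above can be sketched in Python. This is only an illustration under assumptions: the `SecurityDefect` class, the `to_document`, `with_flag`, and `with_type_in` helpers, and the sample titles are hypothetical, and the filters run in memory rather than against DocumentDB. The idea is that each document gets one `is<ClassName>` flag per class in its inheritance chain, alongside a single `type` value.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


class WorkItem:
    def __init__(self, title: str) -> None:
        self.title = title

    def to_document(self) -> Document:
        """Serialize with a `type` field plus one flag per ancestor class."""
        doc: Document = {"title": self.title, "type": type(self).__name__}
        for cls in type(self).__mro__:
            if cls is not object:
                doc[f"is{cls.__name__}"] = True
        return doc


class Defect(WorkItem):
    pass


class Story(WorkItem):
    pass


class SecurityDefect(Defect):  # hypothetical third inheritance level
    pass


def with_flag(docs: List[Document], flag: str) -> List[Document]:
    """Mimics a filter such as `WHERE c.isWorkItem = true`."""
    return [d for d in docs if d.get(flag) is True]


def with_type_in(docs: List[Document], types: List[str]) -> List[Document]:
    """Mimics a filter such as `ARRAY_CONTAINS(<types>, c.type)`."""
    return [d for d in docs if d["type"] in types]


docs = [
    Defect("Crash on save").to_document(),
    Story("Export to PDF").to_document(),
    SecurityDefect("Token leak").to_document(),
]

all_work_items = with_flag(docs, "isWorkItem")
all_defects = with_flag(docs, "isDefect")
listed_types_only = with_type_in(docs, ["Defect", "Story"])
```

In this sketch the flag filters pick up `SecurityDefect` without any change, whereas the type list `["Defect", "Story"]` misses it until the new type name is added wherever that list is used.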