
I've been integrating DocumentDB into a multi-tenant system during the preview. My plan has been to create a new database under the DocumentDB account for each tenant that signs up. Most of this code is already in place and testing is going extremely well.

Now that DocumentDB has been officially released and the documentation is finalized, I read about the limit of 100 databases per DocumentDB account, and it made me stop and rethink my architecture.

I wanted to keep my tenants isolated so that deleting accounts would be easy and the organization would be very clean. Data does not need to be shared between tenants, so keeping it separated would not be an issue.

My questions:

Since my goal is to make this scale up to tens or even hundreds of thousands of tenants, do I need to consider a different architecture due to DocumentDB limitations and/or cost?

Does this mean I need to shard every 100 accounts across multiple DocumentDB accounts?

According to Microsoft, the limit of 100 databases is just a soft limit that can be raised upon request, but can it go up to 100,000+ if needed? What if I get more account sign-ups than expected and hit my limits in production, potentially losing clients?

Does this limitation exist to deter developers from partitioning tenants in this fashion, for a good reason that I should be considering?

INNVTV
  • Does this blog post answer your question: [Scaling a Multi-Tenant Application with Azure DocumentDB](http://azure.microsoft.com/blog/2014/12/03/scaling-a-multi-tenant-application-with-azure-documentdb-2/)? – user272735 Apr 24 '15 at 04:10

1 Answer


There's no such thing as a one-size-fits-all answer when it comes to partitioning / sharding tenant data. Generally, how you partition data depends on your application's query patterns as well as the resource requirements per tenant (in terms of both storage and throughput). Just keep in mind that collections are DocumentDB's unit of boundary for transactions and queries.

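Check out the blog post that user272735 mentioned in the comment above: [Scaling a Multi-Tenant Application with Azure DocumentDB](http://azure.microsoft.com/blog/2014/12/03/scaling-a-multi-tenant-application-with-azure-documentdb-2/). It's a great read.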

If you need some more 1:1 guidance for your particular scenario, or need database/collection limits relaxed, feel free to ping me at andrl {at} microsoft.com.

Andrew Liu
  • That was a good article, and it solidifies my plan to isolate tenants by database. What is not clear to me (and many others, it seems) is the following: 1. What is the "hard" limit on databases within a single DocumentDB account? 2. When I hit a soft limit, will my capacity increase automatically, or will my create request throw an error? 3. Once I go past 100, 500, 1,000 and possibly even 50,000+ databases on a single account, what are the financial implications? Also Andrew, I'll be attending //BUILD/ next week, will you be there? – INNVTV Apr 24 '15 at 14:40
  • Please also note that the following Azure page lists the limit of 100 databases per account without an asterisk, which suggests it is a hard limit: http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/ Is the next step then to partition across accounts? I can see how this could be cost prohibitive... – INNVTV Apr 24 '15 at 14:46
  • I took a good long look at my Azure billing since DocumentDB went GA. It looks like EACH collection on the S1 tier is billed at about $25 a month. Because of that cost, I will have to completely rethink my partitioning scenario and keep documents for multiple accounts within the same collection. If this is true, it is really going to hinder the flexibility I thought I would have with DocumentDB, and may even make me rethink using it. Are my assumptions correct? Any further guidance you can give on the matter? – INNVTV Apr 24 '15 at 16:51
  • In thinking through updating my scenario: if I need to delete all of a tenant's data, do I now have to query for all related documents, paginate over the results, and delete every document individually? My original architecture made it so easy to just delete by database, but $25 a month per tenant is not going to be possible. Any advice on managing this? – INNVTV Apr 24 '15 at 18:23
  • `1)` You're right, the pricing model is per collection. I'd recommend grouping tenants together in collections to save on cost; you can tag each document with a tenantId. You could do something like `hash(Tenant) % NumCollections` or manually assign a tenant-collection mapping in your application. If you give me some more details, I can help you find a natural partitioning scheme (e.g. geography). `2)` I won't be attending `//BUILD/` this year, but a few of my colleagues are. I can put you in touch with them. `3)` Mind shooting me an e-mail? Looks like we are hitting comment character limit. – Andrew Liu Apr 27 '15 at 20:48
  • Thanks Andrew, I just sent you an email and marked this question as answered. Really appreciate all your guidance! – INNVTV Apr 28 '15 at 05:05
  • @aliuy, I want to give my users the option to choose whether they want to be separate or in the shared database. For the isolated users, how do I deal with the redundant `TenantID` column? – Shimmy Weitzhandler Jun 15 '15 at 02:08
  • For those tenants, you could simply ignore the TenantId property (or exclude it) :) – Andrew Liu Jun 15 '15 at 20:07
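
For readers following along, here is a minimal sketch of the `hash(Tenant) % NumCollections` idea from the comments above, written in plain Python with no DocumentDB SDK calls. The names `NUM_COLLECTIONS`, `collection_for_tenant`, `tag_document`, and the `tenants-N` collection naming are all hypothetical, chosen only for illustration. It assumes a stable digest (SHA-256) is preferable to Python's built-in `hash()`, which is randomized per process for strings and so would not route a tenant consistently.

```python
import hashlib

# Hypothetical value: size this from your storage and throughput needs.
NUM_COLLECTIONS = 8


def collection_for_tenant(tenant_id: str, num_collections: int = NUM_COLLECTIONS) -> str:
    """Map a tenant to a collection name using a stable hash of its id."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer bucket key.
    bucket = int.from_bytes(digest[:8], "big") % num_collections
    return f"tenants-{bucket}"  # hypothetical naming scheme


def tag_document(tenant_id: str, document: dict) -> dict:
    """Return a copy of the document tagged with the owning tenant's id."""
    return {**document, "tenantId": tenant_id}


if __name__ == "__main__":
    doc = tag_document("contoso", {"id": "order-1", "total": 42})
    print(collection_for_tenant("contoso"), doc)
```

One caveat with plain modulo routing: changing the number of collections remaps most tenants, so a manually maintained tenant-to-collection mapping (the other option mentioned above) may be easier to evolve.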