
Microsoft Cosmos DB (https://learn.microsoft.com/en-us/azure/cosmos-db/) includes the DocumentDB API, the Table API, and others. I have roughly 10 TB of data and would like fast key-value lookups (very little updating and writing; mostly reads).

  1. So how should I choose between DocumentDB API and Table API?
  2. Or when should I choose DocumentDB API? When should I choose Table API?
  3. Is it good practice to use the DocumentDB API to store 10 TB of data?
nkhuyu

3 Answers


The Azure Cosmos DB Table API was introduced to make Cosmos DB and its advanced features (indexing, geo-distribution, etc.) available to the Azure Table storage community. The idea is that someone using Azure Table storage who needs more advanced features only offered by Cosmos DB can literally just change their connection string and their existing code will work with Cosmos DB.
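To make that concrete, here is a minimal sketch of a Table-style point read, assuming the Microsoft.Azure.Cosmos.Table package and a placeholder table name ("Profiles"); only the connection string decides whether it talks to Azure Table storage or to the Cosmos DB Table API.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos.Table;

static class TableLookup
{
    // Point-read a single entity by partition key and row key.
    // "Profiles" is a placeholder table name; the connection string can be
    // either an Azure Table storage or a Cosmos DB Table API connection string.
    public static async Task<DynamicTableEntity> GetAsync(
        string connectionString, string partitionKey, string rowKey)
    {
        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudTable table = account.CreateCloudTableClient().GetTableReference("Profiles");

        TableResult result = await table.ExecuteAsync(
            TableOperation.Retrieve<DynamicTableEntity>(partitionKey, rowKey));

        return result.Result as DynamicTableEntity;
    }
}
```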

But if you are a greenfield customer then I would recommend using the SQL API (formerly called the DocumentDB API), which is a superset of the Table API. We are constantly investing in providing more advanced features and capabilities in the SQL API, whereas for the Table API we are just looking to maintain compatibility with Azure Table storage's API, which hasn't changed in many years.

How much data you have doesn't have any effect on which API you choose. They both sit on the same multi-model infrastructure and can handle the same sizes of data, query loads, distribution, etc.

Yaron Y. Goland

So how should I choose between DocumentDB API and Table API?

Choosing between the DocumentDB API and the Table API will primarily depend on the kind of data you're going to store. The DocumentDB API provides a schema-less JSON database engine with SQL querying capabilities, whereas the Table API provides a key-value storage service. Since you mentioned that your data is key-value based, the recommendation is to use the Table API.
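As a rough sketch of the two styles (not from the original answer, and using the newer Microsoft.Azure.Cosmos SQL API SDK), the same JSON documents can be fetched key-value style by id and partition key, or queried with SQL; the database, container, and `Device` names below are placeholders.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Newtonsoft.Json;

public class Device
{
    [JsonProperty("id")]
    public string Id { get; set; }

    [JsonProperty("status")]
    public string Status { get; set; }
}

static class SqlApiSketch
{
    public static async Task RunAsync(string endpoint, string key)
    {
        var client = new CosmosClient(endpoint, key);
        Container container = client.GetContainer("telemetry-db", "devices");

        // Key-value style: a point read by id + partition key.
        ItemResponse<Device> read = await container.ReadItemAsync<Device>(
            "device-42", new PartitionKey("plant-1"));
        Console.WriteLine(read.Resource.Status);

        // SQL style: a query over the same JSON documents.
        var query = new QueryDefinition("SELECT * FROM c WHERE c.status = @s")
            .WithParameter("@s", "active");
        using FeedIterator<Device> iterator = container.GetItemQueryIterator<Device>(query);
        while (iterator.HasMoreResults)
        {
            foreach (Device d in await iterator.ReadNextAsync())
            {
                Console.WriteLine(d.Id);
            }
        }
    }
}
```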

Or when should I choose DocumentDB API? When should I choose Table API?

Same as above.

Is it good practice to use the DocumentDB API to store 10 TB of data?

Both the DocumentDB API and the Table API are designed to store huge amounts of data.

However, you may want to look into Azure Table storage as well. Cosmos DB lets you fine-tune the throughput you need and gives you robust indexing/querying support, and that comes at a price. Azure Tables, on the other hand, comes with fixed throughput and limited indexing/querying support, and is extremely cheap compared to Cosmos DB.
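For instance, here is a hedged sketch of what "fine-tune the throughput" looks like with the Microsoft.Azure.Cosmos SDK; the database/container names, partition key path, and the 10,000 RU/s figure are purely illustrative, not a sizing recommendation for 10 TB.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

static class ProvisioningSketch
{
    public static async Task CreateAsync(string endpoint, string key)
    {
        var client = new CosmosClient(endpoint, key);

        // You provision (and pay for) a specific request-unit rate per container;
        // Azure Table storage has no equivalent knob.
        Database database = await client.CreateDatabaseIfNotExistsAsync("catalog");
        Container container = await database.CreateContainerIfNotExistsAsync(
            new ContainerProperties(id: "items", partitionKeyPath: "/category"),
            throughput: 10000); // RU/s sized to the read-heavy lookup rate you expect
    }
}
```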

You may find this link helpful to explore more about Cosmos DB: https://learn.microsoft.com/en-us/azure/cosmos-db/introduction.

Gaurav Mantri
  • We're currently using Azure Table Storage. I'm trying to decide whether to move to the Cosmos Table API or the DocumentDB API. It seems that there are no SQL querying capabilities with the Cosmos Table API. We also want to run some queries on the data. – nkhuyu Oct 28 '17 at 06:20
  • You have querying capabilities on Premium Tables (same as Azure Tables + some more like aggregation etc.). – Gaurav Mantri Oct 28 '17 at 06:26

Please don't flag this as off-topic.

It might help to know in advance: if you are considering the document interface, there is in fact a case-insensitivity that can affect how DataContract classes (and, I believe, all others) are transformed to and from Cosmos.

In the linked discussion below, you will see that there is a case insensitivity in Newtonsoft.Json that can affect how you handle objects that you pass to, or get directly from, the API. Not that Cosmos has ANY flaws; in fact it is totally excellent. But with a document API, you might (like me) start to simply pass DataContract objects into Cosmos (which is obviously not wrong, and in fact very much expected from the object API), yet there are some serializer and naming strategy handler options that you are probably better off at least being aware of up front.

So this is just a note to make you aware of this behavior with an object interface. The discussion is here on GitHub:

https://github.com/JamesNK/Newtonsoft.Json/issues/815
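
A hedged sketch of the symptom and one common way to make the mapping explicit, assuming the Newtonsoft.Json-based serialization used by the older Cosmos SDKs; the `Profile` type is illustrative:

```csharp
using System.Runtime.Serialization;
using Newtonsoft.Json;

[DataContract]
public class Profile
{
    // Cosmos keys documents by a lowercase "id" property. A member named "Id"
    // can appear to save fine, but the round trip back into "Id" is where the
    // surprise shows up. Binding it explicitly avoids relying on any
    // case-(in)sensitive matching behavior of the serializer.
    [DataMember]
    [JsonProperty("id")]
    public string Id { get; set; }

    [DataMember]
    public string Name { get; set; }
}
```

Setting `[DataMember(Name = "id")]` should have the same effect for data-contract types.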

Steven Coco
  • I'm a bit confused. Are you saying it is case-INsensitive or case-SENSITIVE? I was working on this yesterday and I see that the PartitionKey is actually case-SENSITIVE. I was trying to create documents while passing the partition key. – Amogh Natu Oct 23 '19 at 19:26
  • (Sorry I haven't seen this comment in a while ...) The IN-sensitivity comes from this: if you create a `DataMember` --- case SENSITIVE --- named "Id" (capital I), you can still go into the database, but it goes in by default as "id" (lower case), and will NOT deserialize back into the `DataMember` "Id". Sorry if my explanation is loose, it's been a while, but the DB will use the 'key' member named lowercase "id", though typically the C# `DataMember` you will name upper case ... and the round trip breaks: even though it seems to go up transparently, it does not come down that way. – Steven Coco Jan 05 '20 at 22:19
  • I just wanted to add another link to a Cosmos thread about this: https://github.com/Azure/azure-cosmos-dotnet-v2/issues/427 It has more explanation and a test case showing this behavior. – Steven Coco Jan 08 '20 at 02:43