2

I am using Azure CosmosDb as the database of my application.

Let's say that I need to save all the countries, cities, and streets in my database. So, I would have an item that looks like this:

{
    "country": "Brazil",
    "size": 1000,
    "population": 200000,
    "cities": [
        {
            "city": "Rio",
            "population": 8000,
            "streets": [
                {
                    "name": "A",
                    "postalCode": 12345
                },
                {
                    "name": "B",
                    "postalCode": 34567
                }
            ],
            ...

However, since I am talking about all the countries, cities, and streets, this becomes a huge item, bigger than the 2 MB allowed by Cosmos DB.

So, what is the correct way to deal with this? Should I separate the cities and streets into different collections? Using different collections has many drawbacks, though, since it is not possible to use a stored procedure or guarantee a transaction when updating two different collections.

4c74356b41
Artur Quirino
  • The issue is that you've introduced an anti-pattern called an "unbounded array" - regardless of whether the max doc size is 2MB or 16MB, you'll still run out of space at some point (and at that point, your app is effectively broken). How you refactor this is really going to depend on what your query needs are: storing separate docs per city, per street, etc. – David Makogon May 21 '19 at 11:59
  • You could "reference" another collection and do a query. It would be quite inefficient, but it would work. – 4c74356b41 May 21 '19 at 12:51

2 Answers

0

You can put these into the same collection; just use primary keys to separate them logically (this is not technically needed, it's just better). With your data set, it probably makes sense to partition on city (or, less likely, country). You don't have to have identical documents in the same collection, although they would look pretty much the same.
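
As a rough illustration of the idea above, here is a minimal Python sketch of differently shaped documents living side by side and being grouped by a shared partition key value. It is plain Python and does not call any Cosmos DB SDK. The field names `id` and `kind`, the helper `group_by_partition`, and the second city `CityX` with its values are all hypothetical, chosen only for the example; `city` is assumed to be the partition key.

```python
from collections import defaultdict

# Documents of different shapes stored in the same collection.
# "city" is the assumed partition key; "kind" and "id" are hypothetical fields.
items = [
    {"id": "Rio", "kind": "city", "city": "Rio", "country": "Brazil", "population": 8000},
    {"id": "Rio-A", "kind": "street", "city": "Rio", "name": "A", "postalCode": 12345},
    {"id": "Rio-B", "kind": "street", "city": "Rio", "name": "B", "postalCode": 34567},
    # Hypothetical second city with made-up values, only to show a second partition.
    {"id": "CityX", "kind": "city", "city": "CityX", "country": "Brazil", "population": 1},
]


def group_by_partition(docs, partition_key="city"):
    """Group documents by their partition key value (simulates logical partitions)."""
    partitions = defaultdict(list)
    for doc in docs:
        partitions[doc[partition_key]].append(doc)
    return dict(partitions)


partitions = group_by_partition(items)
rio_docs = partitions["Rio"]  # the city document plus its street documents
```

Each document stays small, and adding a street means adding one more document rather than growing an array inside a single item.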

4c74356b41
  • That doesn't make sense; create 2 documents, 3 documents, 100000 documents. – 4c74356b41 May 21 '19 at 05:21
  • I had a similar problem: a single group with multiple assets. I created a collection with groupId as the partition key and stored thousands of records against a single groupId. When I need to retrieve all the data, I query by groupId. – Pankaj Rawat May 21 '19 at 12:19
  • I think you can make country the partition key and store each city as a separate document. – Pankaj Rawat May 21 '19 at 12:21
0

Can you explain why you need everything in a single giant document? And why do you need transactions for updates of this data?

A better approach is to use multiple individual documents, all stored in a single collection for easier management.

Use a field in each document to describe what level it's for (country, city, zip) and then store all the necessary information in that document for that level. You can probably use the country as the partition key as it will likely fit within the 10GB/partition limit.
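
The split described above could look roughly like the following Python sketch, which flattens one nested item into one small document per level. It is plain Python with no SDK calls. The function name `split_country`, the `type` field name, and the `id` scheme joined with `|` are assumptions made for the example, and it uses streets as the lowest level to match the item in the question.

```python
def split_country(country_doc):
    """Flatten one nested country item into separate per-level documents.

    Every document carries "country", the assumed partition key, and a
    "type" field saying which level it describes.
    """
    country = country_doc["country"]
    docs = [{
        "id": country,
        "type": "country",
        "country": country,
        "size": country_doc["size"],
        "population": country_doc["population"],
    }]
    for city in country_doc.get("cities", []):
        docs.append({
            "id": f"{country}|{city['city']}",  # hypothetical id scheme
            "type": "city",
            "country": country,
            "city": city["city"],
            "population": city["population"],
        })
        for street in city.get("streets", []):
            docs.append({
                "id": f"{country}|{city['city']}|{street['name']}",
                "type": "street",
                "country": country,
                "city": city["city"],
                "name": street["name"],
                "postalCode": street["postalCode"],
            })
    return docs


# The nested item from the question, limited to the parts that were shown.
nested = {
    "country": "Brazil",
    "size": 1000,
    "population": 200000,
    "cities": [{
        "city": "Rio",
        "population": 8000,
        "streets": [
            {"name": "A", "postalCode": 12345},
            {"name": "B", "postalCode": 34567},
        ],
    }],
}

docs = split_country(nested)  # one country, one city, two street documents
```

Since stored procedures in Cosmos DB run within a single logical partition, documents that share the same country value can still be updated together in one transaction, which may cover the transactional concern raised in the question.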

Mani Gandham