How to keep documents with 2 partition keys in sync / referential integrity?

Question

I have a Cosmos DB with high-cardinality synthetic partition keys and a type property on each document. I need a setup where users can share documents with each other.

For example, this is a document:

{
  "id": "guid",
  "title": "Example document to share",
  "ownerUserId": "user1Guid",
  "type": "usersDocument",
  "partitionKey": "user_user1Guid_documents"
}

Now a user wants to share this document with another user.

Assumptions:

One document can be shared with many users (thousands).
One user can have thousands of documents shared with them.

For these 2 reasons:

I don't want to embed the shares in the document documents or in the user documents (writes would very soon become inefficient/expensive); I would prefer to keep those m:n relationships as separate documents.
I don't want to put the shares for all users/documents in a single partition, as it will create hot spots very soon.

I need both queries:

1. ListDocumentsSharedWithMe: at query time, I know the id of the user the documents are shared with.

2. ListAllUsersISharedThisDocumentWith: at query time, I know the id of the document that has been shared with different users.

All this makes me think I should have 2 separate document types with separate partition keys.

For listing all documents shared with me:

{
  "id": "documentGuid",
  "type": "sharedWithMe",
  "partitionKey": "sharedWithMe_myUserGuid"
}

(This could also be a single document with a collection of shared documents. The important part here is the partitionKey.)

Now I can easily run SQL like SELECT * FROM c WHERE c.type = "sharedWithMe" against the partition key containing my user guid.

For listing all users I shared some document with, it's similar:

{
  "id": "userISharedWithGuid",
  "type": "documentSharings",
  "partitionKey": "documentShare_documentGuid"
}

Now I can easily run SQL like SELECT * FROM c WHERE c.type = "documentSharings" against the partition key containing my document guid.
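
To make the pairing concrete, here is a minimal Python sketch of the two documents that one share would produce. The helper name build_share_documents is made up for illustration; the field names and partition key prefixes are taken from the examples above.

```python
def build_share_documents(document_id, shared_with_user_id):
    """Return the two documents that together represent a single share."""
    # Lives in the recipient's partition: answers ListDocumentsSharedWithMe.
    shared_with_me = {
        "id": document_id,
        "type": "sharedWithMe",
        "partitionKey": f"sharedWithMe_{shared_with_user_id}",
    }
    # Lives in the document's partition: answers ListAllUsersISharedThisDocumentWith.
    document_sharing = {
        "id": shared_with_user_id,
        "type": "documentSharings",
        "partitionKey": f"documentShare_{document_id}",
    }
    return shared_with_me, document_sharing
```

Each document carries the id of the "other side" of the relationship, so either query stays inside one partition.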

Question:

When a user shares a document with another user, both documents should be created, each with a different partition key (thus, no stored procedures/transactions).

How to keep this “atomic-like” or avoid create/update anomalies?

Or is there any better way to model this?

score 1 · Answer 1 · answered Jul 03 '19 at 18:10

I think your method makes sense; I do something similar to partition in multiple ways based on the scope of a query. I assume your main concern is a failure happening between saving the first and the last of a set of related documents? Unfortunately, the only way to manage the chain of documents as they save is within your application code. That is, we make sure we save in the order that makes it easiest to roll back, and then implement a rollback method within the exception handler. This works by keeping a collection of the saved documents in memory.
As you say, since you are writing across partitions, there is no transaction handling out of the box.
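
A minimal Python sketch of that rollback idea, under some assumptions: InMemoryContainer is a hypothetical stand-in for a real container so the example runs on its own, and its create_item/delete_item methods are only modelled loosely on a typical SDK shape. The share_document function and the save order are illustrative, not a prescribed implementation.

```python
class InMemoryContainer:
    """Hypothetical stand-in for a container, keyed by (partitionKey, id)."""

    def __init__(self, fail_on_types=()):
        self.items = {}
        self.fail_on_types = set(fail_on_types)  # simulate write failures

    def create_item(self, body):
        if body["type"] in self.fail_on_types:
            raise RuntimeError(f"simulated failure writing {body['type']}")
        self.items[(body["partitionKey"], body["id"])] = body

    def delete_item(self, item, partition_key):
        self.items.pop((partition_key, item), None)


def share_document(container, document_id, user_id):
    """Write both share documents; undo the ones already saved on failure."""
    docs = [
        {"id": user_id, "type": "documentSharings",
         "partitionKey": f"documentShare_{document_id}"},
        {"id": document_id, "type": "sharedWithMe",
         "partitionKey": f"sharedWithMe_{user_id}"},
    ]
    saved = []  # in-memory record of what has been written so far
    try:
        for doc in docs:
            container.create_item(body=doc)
            saved.append(doc)
    except Exception:
        # Compensate in reverse order, then surface the original error.
        for doc in reversed(saved):
            container.delete_item(item=doc["id"], partition_key=doc["partitionKey"])
        raise
    return docs
```

Note that the compensating deletes can fail too (or the process can die mid-way), so this only narrows the window for anomalies; it does not make the two writes truly atomic.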
