Django MPTTModel Tree structure gets corrupted inside Redis Cache upon concurrent creation and retrieval operations

Question

Usecase:

We have a model, Content, that represents hierarchical data for our organization. To maintain this hierarchy we use the django-mptt package, which is based on the Modified Preorder Tree Traversal algorithm. To make response times faster for the client, we cache our models in Redis via the django-cacheops package.

Image source: http://www.sitepoint.com/hierarchical-data-database-2/

Problem:

Because we use MPTTModel to maintain the hierarchical structure of our data in the database, concurrent CREATE and READ requests from clients leave us with corrupted data in our Redis cache.

Reason:

When we do a create operation on the tree, the MPTTModel package creates the object in the database and updates the tree structure. If a read request for that instance arrives while the package is still building/maintaining the tree, the result is corrupted data in the Redis cache.
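
To make the suspected race concrete, here is a minimal pure-Python sketch of a nested-set insert split into two steps. It is a simplified model and not django-mptt's actual code: the `Node` class and all helper names are made up for illustration, and only the `lft`/`rght` field names mirror MPTT. A read that lands between the two steps sees a parent whose interval already accounts for a child that has no row yet, and that is the kind of snapshot a cache could end up storing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    lft: int
    rght: int


def make_space(nodes, parent):
    """Step 1: shift every boundary at or beyond the parent's right edge by 2."""
    edge = parent.rght
    for node in nodes:
        if node.lft > edge:
            node.lft += 2
        if node.rght >= edge:
            node.rght += 2
    return edge


def insert_leaf(nodes, name, edge):
    """Step 2: write the new row into the gap opened by step 1."""
    leaf = Node(name, edge, edge + 1)
    nodes.append(leaf)
    return leaf


def implied_descendants(node):
    # Count derived purely from the node's own lft/rght values.
    return (node.rght - node.lft - 1) // 2


def stored_descendants(nodes, node):
    # Names of the rows that actually sit inside the node's interval.
    return [n.name for n in nodes if node.lft < n.lft and n.rght < node.rght]


root, a = Node("root", 1, 4), Node("a", 2, 3)
table = [root, a]

edge = make_space(table, a)
# A read that lands here sees `a` claiming one descendant that has no row yet.
mid_read = (implied_descendants(a), stored_descendants(table, a))

insert_leaf(table, "b", edge)
final_read = (implied_descendants(a), stored_descendants(table, a))

print(mid_read)
print(final_read)
```

If both steps run inside one database transaction, other connections should only ever see the state before step 1 or after step 2, which is why the comments below focus on atomicity.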

Note:- our system architecture is based on Microservice Architecture due to which we've got a couple of requests being made by different Microservices.

Has anyone faced such an issue before and what solution can we use to resolve this problem?

Can you run the entire create request in a transaction so that the changes are committed in one go? Setting [ATOMIC_REQUESTS](https://docs.djangoproject.com/en/3.2/topics/db/transactions/#tying-transactions-to-http-requests) in your DB settings might be worth looking at — Iain Shelvington, Nov 19 '21 at 12:21
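
For reference, the setting mentioned in the comment above lives inside the per-database dictionary in settings.py. This is only a fragment: the `ENGINE` and `NAME` values are placeholders assumed for illustration, not taken from the question.

```python
# settings.py (fragment). ENGINE and NAME are placeholder values, not from the question.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "content_db",
        # Wrap each view call in a transaction: commit if the view returns
        # normally, roll back if it raises an exception.
        "ATOMIC_REQUESTS": True,
    }
}
```
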
I understand your suggestion of using atomic requests, but restrictions in our system architecture prevent me from implementing them. We have scenarios where a request to create a parent CONTENT needs to call another microservice, and that microservice has to call the same service again to create a child CONTENT. If I use the @transaction.atomic decorator to make the CREATE request atomic, I'm not able to complete this whole cycle. — nick, Nov 19 '21 at 13:14
Is this accurate for the flow of what you describe? 1) There is a request to create parent content 2) The parent content is saved 3) A request is made to another microservice 4) The microservice makes a request to our service to create child content 5) The child content is created and the microservice's request completes 6) Our request to the microservice completes 7) The initial create request then completes — Iain Shelvington, Nov 19 '21 at 13:25
Do you have the option of using an asynchronous task queue like celery? Instead of having your app wait on the result of several back and forth calls you could just schedule a task to call the microservice after the initial create has completed, that way you __could__ have atomic requests and you wouldn't have this crazy chain of calls and waiting — Iain Shelvington, Nov 19 '21 at 13:31
We can consider this. If we go with a queue like Celery or Kafka and request 3) fails, we'll need to create another task to roll back the initial request, right? — nick, Nov 19 '21 at 13:34
If the other microservice is unable to create the child content then the parent content has to be cleaned up? That's not happening now in the process you described, the parent is saved to the DB because the request is not atomic, unless you currently clean up afterwards and have a period where the incomplete data is temporarily in the DB? — Iain Shelvington, Nov 19 '21 at 13:36
It is being cleaned up if the Content microservice which makes the call 3) receives a failure response from the other microservice — nick, Nov 19 '21 at 13:38
@IainShelvington I'll share a diagram to make things a little more clear — nick, Nov 19 '21 at 13:41
This process seems like a nightmare... having an external service make another request during a request and then trying to wrap that into a single mutation. Why does the other microservice call you and why does it need to be called to create the child? Could that microservice not respond in a format that you could use to create the child as this would make things a lot simpler? — Iain Shelvington, Nov 19 '21 at 13:44
can you have a look at this diagram: https://lucid.app/lucidchart/e6b4f09f-9c9f-49bd-9383-aab3effb1d6e/edit?viewport_loc=-124%2C-164%2C2219%2C1108%2C0_0&invitationId=inv_ed3c7436-aa4b-4a5a-9c53-ac49f6260f80 — nick, Nov 19 '21 at 13:50
When MOD MS makes the call to create child content, what data does it send? Could that data not be sent in the response and handled in the original request instead? — Iain Shelvington, Nov 19 '21 at 13:55
Just one more thing, sorry for the spam. Is there a significant performance impact from not using the cache? — Iain Shelvington, Nov 19 '21 at 13:58
No. For the child to exist, the parent has to exist first, and before a child is created the MOD MS has to do a few things, like uploading the relevant files, binaries, and zips to S3 storage. Only once that process is complete should a child be created. — nick, Nov 19 '21 at 14:02
No problem @IainShelvington, I should be the one thanking you for being active here. And yes, the cache has a significant impact on the system. — nick, Nov 19 '21 at 14:04
The only two options I can see right now are 1) use an asynchronous task queue, atomic requests and some clean up process 2) Have super granular and manual control of your cache and implement some kind of lock so that it's only updated when nothing is being created. Sorry I can't be much more help, this is one of those problems that you can spend weeks on — Iain Shelvington, Nov 19 '21 at 14:10
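
A rough sketch of what option 2 could look like, using an in-process stand-in so it runs on its own. Everything here is hypothetical (`GuardedCache`, `creating`, `get_or_load`), and none of it is django-cacheops API. With several services writing to the same Redis instance, the lock and counters would have to live in a shared store rather than in `threading` primitives. The idea is that reads made while a create is in flight are served but never stored, and entries cached before the create are dropped when it finishes.

```python
import threading
from contextlib import contextmanager


class GuardedCache:
    """In-process stand-in for a shared cache; all names are hypothetical."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()
        self._creates_in_flight = 0
        self._generation = 0  # bumped whenever a create starts or ends

    @contextmanager
    def creating(self):
        """Mark a tree mutation as in progress for the duration of the block."""
        with self._lock:
            self._creates_in_flight += 1
            self._generation += 1
        try:
            yield
        finally:
            with self._lock:
                self._creates_in_flight -= 1
                self._generation += 1
                self._data.clear()  # entries cached before the create may be stale

    def get_or_load(self, key, loader):
        with self._lock:
            if key in self._data:
                return self._data[key]
            seen_generation = self._generation
        value = loader()  # e.g. the database query; runs outside the lock
        with self._lock:
            # Store only if no create is running and none started or finished
            # while we were loading, so a half-built tree is never cached.
            if self._creates_in_flight == 0 and seen_generation == self._generation:
                self._data[key] = value
        return value


cache = GuardedCache()
loads = []


def load_tree():
    loads.append(1)
    return "tree"


with cache.creating():
    cache.get_or_load("content-tree", load_tree)
    cache.get_or_load("content-tree", load_tree)  # not cached while creating
loads_during_create = len(loads)

cache.get_or_load("content-tree", load_tree)  # loads once and caches
cache.get_or_load("content-tree", load_tree)  # served from the cache
loads_total = len(loads)
```

The trade-off is that reads during a create always hit the database, so a long chain of cross-service calls inside one create would keep the cache cold for that whole period.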
