
I've been tinkering with Cosmos DB for a few days now. The documentation says its TTL works the same way as MongoDB's, but it clearly doesn't. It works fine with an int32 value, but it disregards a date completely.

For reference, I have the index set up on the "_ts" field with expireAfterSeconds set to -1. Then I give each document its own ttl field for however long I want it to stick around.

Since the int approach does work, I went ahead and wrote code to turn future dates into seconds, and that works completely fine. So my other question is: is it okay to use the ttl for expiry times that are very far away? I may want documents to auto-expire after a month, and that's a lot of seconds for it to figure out on the fly. There may also be thousands of documents in this collection at any given time, all with different ttls.
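The date-to-seconds conversion described above might look something like this minimal sketch. The helper name `ttlSecondsUntil` and the sample document are hypothetical, and the sketch assumes the ttl countdown starts when the document is written.

```javascript
// Hypothetical helper: turn a future expiry date into a whole number of
// seconds suitable for a per-document "ttl" field.
function ttlSecondsUntil(expiresAt, now = new Date()) {
  const seconds = Math.ceil((expiresAt.getTime() - now.getTime()) / 1000);
  if (seconds <= 0) {
    throw new RangeError("expiry date must be in the future");
  }
  return seconds;
}

// Example: a document that should disappear 30 days after it is written.
const now = new Date("2021-06-05T00:00:00Z");
const expiresAt = new Date("2021-07-05T00:00:00Z");
const doc = { name: "example", ttl: ttlSecondsUntil(expiresAt, now) };
console.log(doc.ttl); // 2592000
```

Passing `now` explicitly keeps the helper deterministic, which makes it easier to check than one that always reads the system clock.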

Batman

1 Answer


Although the end result of TTL is the same in both, i.e., the data gets deleted automatically, how it happens and the configuration it requires differ slightly, and the MongoDB and CosmosDB documentation each explain the steps clearly.

For MongoDB, you need an index (a TTL index on a field that holds a BSON date or an array of BSON dates) to enable this functionality, and you have to create it explicitly, whereas in CosmosDB no date field of your own is needed. CosmosDB internally tracks the last-updated time of each document and purges documents automatically once the TTL in seconds has elapsed. Both MongoDB and CosmosDB take the TTL duration in seconds.

For MongoDB, the logic that the TTL purge thread follows is something like this (note that this is just an illustration):

// Assuming the collection has a created_at field (a BSON date), which is the basis for TTL.
// expireAfterSeconds comes from the TTL index definition; one hour is just an example value.
var expireAfterSeconds = 3600;
// Unix epoch (1970-01-01T00:00:00Z)
var startTime = new Date(0);
// cutoff: anything created at or before this moment has expired
var endTime = new Date(Date.now() - expireAfterSeconds * 1000);
// Remove all documents between start and end. Since this is a range query,
// it needs an index to be efficient: the TTL index.
db.collection.deleteMany({ created_at: { $gt: startTime, $lte: endTime } });

In the case of CosmosDB, an internal timestamp property, _ts, is automatically associated with each document and is updated whenever the document is updated. This is what the automatic purge logic uses.
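As a rough mental model only (not CosmosDB's actual implementation), the expiry decision can be sketched like this. The function name `isExpired` is hypothetical, and the sketch assumes that _ts is the last-modified time in epoch seconds, that a per-document ttl overrides the collection default, and that -1 means "never expire".

```javascript
// Illustrative model only; the real purge runs inside the service.
// doc._ts: last-modified time in epoch seconds (assumed)
// doc.ttl: optional per-document lifetime in seconds
// defaultTtl: collection-level default, -1 meaning "never expire" (assumed)
function isExpired(doc, defaultTtl, nowSeconds) {
  // A per-document ttl, when present, overrides the collection default.
  const ttl = doc.ttl !== undefined ? doc.ttl : defaultTtl;
  if (ttl === -1) {
    return false;
  }
  return nowSeconds >= doc._ts + ttl;
}

// With a default of -1, only documents carrying their own ttl ever expire.
console.log(isExpired({ _ts: 1000, ttl: 60 }, -1, 1059)); // false
console.log(isExpired({ _ts: 1000, ttl: 60 }, -1, 1060)); // true
console.log(isExpired({ _ts: 1000 }, -1, 999999)); // false
```

Because _ts changes on every update, updating a document restarts its countdown in this model.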

Shailendra
  • Ah I see. So I'm mistaken on timestamp based deletion even for mongo. That makes a little more sense. I appreciate the answer. Then my second question still holds though. Is it okay to use the ttl to expire documents several days or even months after creation, or should these deletions be handled by another task – Batman Jun 05 '21 at 15:24
  • Of course you can use ttl for that. You would just need to convert your duration into seconds. I guess what you are really asking is whether it is the right decision. Since this is a feature that automatically purges data after some time, it would generally be recommended over writing your own external delete logic triggered by a scheduler. – Shailendra Jun 05 '21 at 18:33
  • If this answers your question - as a good practice on SO please mark the answer as accepted :-) – Shailendra Jun 06 '21 at 05:49
  • Yeah, I've marked your answer. Thank you for the assistance. I suppose what I'm asking is: do people use this ttl deletion for durations of weeks or even months? I want to make sure I'm following best practices with it, and most documentation mentions very short durations. – Batman Jun 06 '21 at 16:15
  • Thanks. Yes the TTL depends upon the problem context at hand - there are possible use-cases where you would like to have a large TTL. One example I can think of is cart in an eCommerce site where you might want to keep the cart items for long duration like one week or month and then remove it automatically. – Shailendra Jun 06 '21 at 18:36
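To put week-long or month-long durations in perspective, here is a small sketch that converts days into ttl seconds. The helper name `daysToTtl` is hypothetical, and the upper bound assumes the ttl is stored as a signed 32-bit integer, as in the question.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400
const INT32_MAX = 2 ** 31 - 1; // 2147483647

// Hypothetical helper: convert a duration in days to ttl seconds,
// rejecting values that would not fit in a signed 32-bit integer.
function daysToTtl(days) {
  const ttl = Math.round(days * SECONDS_PER_DAY);
  if (ttl < 1 || ttl > INT32_MAX) {
    throw new RangeError("ttl must be between 1 and " + INT32_MAX + " seconds");
  }
  return ttl;
}

console.log(daysToTtl(7)); // 604800 (one week)
console.log(daysToTtl(30)); // 2592000 (about one month)
console.log(Math.floor(INT32_MAX / SECONDS_PER_DAY)); // 24855 days, roughly 68 years
```

A month is therefore only a small fraction of what a 32-bit seconds value can hold.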