Imagine you have millions of users who perform transactions on your platform. Assuming each transaction is a document in your MongoDB collection, millions of documents would be generated every day, and the database would grow out of control in no time. I have received the following suggestions from friends and family.
- Having a TTL index on the documents - This won't work because we need those documents stored somewhere so that they can be retrieved later, when the user asks for them.
- Sharding the collection with the timestamp as the key - This won't let us control the time frame after which the data is backed up.
I would like to understand and implement a strategy similar to what banks follow. They keep your transactions up to a certain point (e.g. 6 months), after which you have to request them via support or some other channel. I am assuming they follow a Hot/Cold storage pattern, but I am not completely sure about it.
The entire point is to manage transaction documents and, on a daily basis, back up or move the older records to another place from which they can still be read. Any idea how this is possible with MongoDB?
Update: Sample document (please note that a few other keys have been redacted from the document)
{
"_id" : ObjectId("5d2c92d547d273c1329b49f0"),
"transactionType" : "type_3",
"transactionTimestamp" : ISODate("2019-07-15T14:51:54.444Z"),
"transactionValue" : 0.2,
"userId" : ObjectId("5d2c92f947d273c1329b49f1")
}
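
To make the goal concrete, here is a rough sketch of the selection logic I have in mind for the daily job, using the transactionTimestamp field from the sample document above. The function names (archiveCutoff, archiveFilter, splitByCutoff), the 6-month retention period, and the collection names in the comments (transactions, transactions_archive) are all assumptions made for illustration, not an existing setup.

```javascript
// Returns a new Date that lies `retentionMonths` before `now` (UTC).
// Note: JavaScript rolls over invalid days, so Aug 31 minus 6 months
// lands in early March rather than on the last day of February.
function archiveCutoff(now, retentionMonths) {
  const cutoff = new Date(now.getTime()); // copy, so `now` is not mutated
  cutoff.setUTCMonth(cutoff.getUTCMonth() - retentionMonths);
  return cutoff;
}

// Builds the query filter that selects documents older than the cutoff.
function archiveFilter(cutoff) {
  return { transactionTimestamp: { $lt: cutoff } };
}

// In-memory equivalent of the filter, handy for checking the logic
// without a running database.
function splitByCutoff(docs, cutoff) {
  const hot = [];
  const cold = [];
  for (const doc of docs) {
    (doc.transactionTimestamp < cutoff ? cold : hot).push(doc);
  }
  return { hot, cold };
}

// Hypothetical use in the mongo shell (collection names are made up):
//   const filter = archiveFilter(archiveCutoff(new Date(), 6));
//   db.transactions.find(filter).forEach(function (doc) {
//     db.transactions_archive.insertOne(doc);
//   });
//   db.transactions.deleteMany(filter);
```

The copy and the delete in the commented part are two separate steps, so a failure in between would leave duplicates in both collections. I am not sure whether this is the right approach at this scale, which is part of what I am asking.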