I have a existing collection of ~12 million documents. I want to update one field in all the documents to have a running number within all groups of documents that share a common "ref" field. This would be a one time operation. Is there any way I can achieve this in MongoDB 4.4?
Simplified documents example:
{"_id": 1, "ref": "REF_A", "description": "aaaa"}
{"_id": 2, "ref": "REF_A", "description": "bbbb"}
{"_id": 3, "ref": "REF_A", "description": "cccc"}
{"_id": 4, "ref": "REF_B", "description": "dddd"}
{"_id": 5, "ref": "REF_B", "description": "eeee"}
...
Desired modified output:
{"_id": 1, "ref": "REF_A", "description": "aaaa1"}
{"_id": 2, "ref": "REF_A", "description": "bbbb2"}
{"_id": 3, "ref": "REF_A", "description": "cccc3"}
{"_id": 4, "ref": "REF_B", "description": "dddd1"} <- reset count because ref changed
{"_id": 5, "ref": "REF_B", "description": "eeee2"}
...
The running number is concatenated to description field here. As soon as the "ref" changes, the concat number counter should reset and start from 1 again. When sorted by "_id" all the same refs are already together. Order matters.
I've been looking at aggregations to solve this, but it seems I would need a way to refer to previous documents and could not figure it out yet.
The best I could find is this thread: Add some kind of row number to a mongodb aggregate command / pipeline But does not seem to suit my case where row number is being reset under a condition.