1

I have a mongodb database with many users and one of the subdocuments I track is file uploads and their statuses through a review process. Every file upload will have an attachment status eventually. I want to be able to pull some metrics to get the total of the current statuses for each uploaded file. I started building an aggregate query that pulls the latest attachment subdocument status from each file uploaded and count them.

The data structure is as follows:

 "userName": "johnDoe",
 "email": "johnDoe@gmail.com",
 "uploads" : [
                {
                        "_id" : ObjectId("adh12451e0012ce9da0"),
                        "fileName" : "TestDoc.txt",
                        "fileType" : "text/plain",
                        "created" : ISODate("2021-01-06T15:26:14.166Z"),
                        "attachmentStatus" : [ ]
                },
                {
                        "_id" : ObjectId("5ff5d6c066cacc0012ed655a"),
                        "fileName" : "testerABC.txt",
                        "fileType" : "text/plain",
                        "created" : ISODate("2021-01-06T15:26:56.027Z"),
                        "attachmentStatus" : [
                                {
                                        "_id" : ObjectId("60884f733f88bd00129b9ad4"),
                                        "status" : "Uploaded",
                                        "date" : ISODate("2021-04-22T02:23:00Z")
                                },
                                {
                                        "_id" : ObjectId("60884f733f88bd00129b9ad5"),
                                        "status" : "Processing",
                                        "date" : ISODate("2021-04-26T04:54:00Z")
                                }
                        ]
                },
                {
                        "_id" : ObjectId("6075c82a19fdcc0012f81907"),
                        "fileName" : "Test file.docx",
                        "fileType" : "application/word",
                        "created" : ISODate("2021-04-13T16:34:50.955Z"),
                        "attachmentStatus" : [
                                {
                                        "_id" : ObjectId("72844f733f88bd11479b9ad7"),
                                        "status" : "Uploaded",
                                        "date" : ISODate("2021-04-23T03:42:00Z")
                                },
                                {
                                        "_id" : ObjectId("724986d73f88bd00147c9wt8"),
                                        "status" : "Completed",
                                        "date" : ISODate("2021-04-24T01:37:00Z")
                                }
                        ]
                }
        ]

 "userName": "janeDoe",
 "email": "janeDoe@gmail.com",
 "uploads" : [
                {
                        "_id" : ObjectId("ej9784652h0012ce9da0"),
                        "fileName" : "myResume.txt",
                        "fileType" : "text/plain",
                        "created" : ISODate("2021-02-13T12:36:14.166Z"),
                        "attachmentStatus" : [
                                {
                                        "_id" : ObjectId("15dhdf6f88bd00147c9wt8"),
                                        "status" : "Completed",
                                        "date" : ISODate("2021-04-24T01:37:00Z")
                                }
                  ]
                }, 

How can I pull the latest attachment status out for each file uploaded and then summarize the statuses?

I want something like this:

{ "status" : "Completed", "Count" : 2 }
{ "status" : "Processing", "Count" : 1 }
...

I get very close with this Aggregate query, but it will grab each and every status and not just the the single most current Status for each file. (one current status per file).

db.myDB.aggregate([
{
  "$match" : {
    "uploads.attachmentStatus": {
      "$elemMatch": { "status": { "$exists": true } }
      }
    }
},
{ $unwind: "$uploads"},
{ $unwind: "$uploads.attachmentStatus"},
{
  $sortByCount: "$uploads.attachmentStatus.status"
},
{
  $project: {
    _id:0,
    status: "$_id",
    Count: "$count"
  }
}
]).pretty();

Any suggestions?

Tushar Gupta - curioustushar
  • 58,085
  • 24
  • 103
  • 107
meanstack7
  • 11
  • 1

1 Answers1

0

Demo - https://mongoplayground.net/p/zzOR9qhqny0


  • { $sort: { "uploads.attachmentStatus.date": -1 } }, to get the latest 1st

  • { $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } } Group the records by uploads._id and take the top status (which is the latest status after the sort by date).

Query

{ $sort: { "uploads.attachmentStatus.date": -1 } },
{ $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } },

Complete query

db.collection.aggregate([
  { $match: { "uploads.attachmentStatus": { "$elemMatch": {  "status": { "$exists": true } } } } },
  { $unwind: "$uploads" },
  { $unwind: "$uploads.attachmentStatus" },
  { $sort: { "uploads.attachmentStatus.date": -1 } },
  { $group: { _id: "$uploads._id", status: { $first: "$uploads.attachmentStatus.status" } } },
  { $sortByCount: "$status" },
  { $project: { _id: 0, status: "$_id", Count: "$count" } }
])
Tushar Gupta - curioustushar
  • 58,085
  • 24
  • 103
  • 107