333

I am playing around with MongoDB trying to figure out how to do a simple

SELECT province, COUNT(*) FROM contest GROUP BY province

But I can't seem to figure it out using the aggregate function. I can do it using some really weird group syntax

db.user.group({
    "key": {
        "province": true
    },
    "initial": {
        "count": 0
    },
    "reduce": function(obj, prev) {
        if (true != null) if (true instanceof Array) prev.count += true.length;
        else prev.count++;
    }
});

But is there an easier/faster way using the aggregate function?

M. Justin
  • 14,487
  • 7
  • 91
  • 130
Steven
  • 13,250
  • 33
  • 95
  • 147

9 Answers9

532

This would be the easier way to do it using aggregate:

db.contest.aggregate([
    {"$group" : {_id:"$province", count:{$sum:1}}}
])
Steven
  • 13,250
  • 33
  • 95
  • 147
Anand Jayabalan
  • 12,294
  • 5
  • 41
  • 52
158

I need some extra operation based on the result of aggregate function. Finally I've found some solution for aggregate function and the operation based on the result in MongoDB. I've a collection Request with field request, source, status, requestDate.

Single Field Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:"$source", count:{$sum:1}}}
])

Multiple Fields Group By & Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}}
])

Multiple Fields Group By & Count with Sort using Field:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"_id.source":1}}
])

Multiple Fields Group By & Count with Sort using Count:

db.Request.aggregate([
    {"$group" : {_id:{source:"$source",status:"$status"}, count:{$sum:1}}},
    {$sort:{"count":-1}}
])
csharpbd
  • 3,786
  • 4
  • 23
  • 32
  • can you explain {_id:{source:"$source",status:"$status"} this? – Hridoy_089 Jan 18 '21 at 06:19
  • 1
    Basically, the field "_id" is a unique identifier for each document. The field accepts an expression. You can define the value of the field by combining multiple fields based on your grouping criteria. You will find more details about the field in the link: https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group – csharpbd Jan 18 '21 at 13:20
72

If you need multiple columns to group by, follow this model. Here I am conducting a count by status and type:

  db.BusinessProcess.aggregate({
    "$group": {
        _id: {
            status: "$status",
            type: "$type"
        },
        count: {
            $sum: 1
        }
    }
   })
EugenSunic
  • 13,162
  • 13
  • 64
  • 86
KitkatJohn
  • 729
  • 5
  • 2
  • 3
    _id represents a default param for encapsulating multiple fields? – EugenSunic Dec 09 '18 at 15:14
  • @RoyiNamir, please take a look at the link. You may find your information there. https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group – csharpbd Aug 19 '20 at 19:42
52

Starting in MongoDB 3.4, you can use the $sortByCount aggregation.

Groups incoming documents based on the value of a specified expression, then computes the count of documents in each distinct group.

https://docs.mongodb.com/manual/reference/operator/aggregation/sortByCount/

For example:

db.contest.aggregate([
    { $sortByCount: "$province" }
]);
MattM
  • 1,159
  • 1
  • 13
  • 25
  • 4
    Probably worth noting here that `$sortByCount` is actually a "pseudo operator" like several more aggregation stage operators introduced from MongoDB 3.4. All they really do is **expand** into their respective aggregation stages. In this case a `$group` with `$sum: 1` as shown in existing answers and an additional `$sort` stage. They offer no advantage other than *"typing less code"*, which may or may not be more descriptive ( if you're into that sort of thing ). IMHO, distinct `$group` and `$sort` stages in the code are far more descriptive and indeed more flexible. – Neil Lunn May 17 '19 at 06:51
26

Additionally if you need to restrict the grouping you can use:

db.events.aggregate( 
    {$match: {province: "ON"}},
    {$group: {_id: "$date", number: {$sum: 1}}}  
)
andre
  • 1,084
  • 11
  • 13
10

This type of query worked for me:

 db.events.aggregate({$group: {_id : "$date", number:  { $sum : 1} }} )

See http://docs.mongodb.org/manual/tutorial/aggregation-with-user-preference-data/

prule
  • 2,536
  • 31
  • 32
6

Starting in Mongo 5.0, we can also use { $count: { } } as an alias for { $sum : 1 }:

// { "province" : "Champagne-Ardenne" }
// { "province" : "Champagne-Ardenne" }
// { "province" : "Haute-Normandie"   }
db.collection.aggregate([
  { $group: { _id: "$province", count: { $count: {} } } }
])
// { "_id" : "Champagne-Ardenne", "count" : 2 }
// { "_id" : "Haute-Normandie",   "count" : 1 }
Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
5
    db.contest.aggregate([
        { $match:{.....May be some match criteria...}},
        { $project: {"province":1,_id:0}},
        { $sortByCount: "$province" }
    ],{allowDiskUse:true});

MongoDB have 32 MB limitation of sorting operation on memory, use allowDiskUse : true this option, when you expose this query upfront of millions of data, it will sort at disk level not in memory. MongoDB aggregation pipeline has 100MB limitation, so use $project to reduce the data flowing to next pipeline. If you are using small data then no need to use allowDiskUse option.

Kaushik Das
  • 405
  • 1
  • 5
  • 12
-1

Mongo shell command that worked for me:

db.getCollection(<collection_name>).aggregate([{"$match": {'<key>': '<value to match>'}}, {"$group": {'_id': {'<group_by_attribute>': "$group_by_attribute"}}}])
Riha
  • 9
  • 1