69

My collection in mongodb is similar to the following table in SQL:

Sentiments(Company,Sentiment)

Now, I need to execute a query like this:

SELECT
  Company, 
  SUM(CASE WHEN Sentiment >0 THEN Sentiment ELSE 0 END) AS SumPosSenti, 
  SUM(CASE WHEN Sentiment <0 THEN Sentiment ELSE 0 END) AS SumNegSenti
FROM Sentiments
GROUP BY Company

What should I do to write this query in Mongo? I am stuck at the following query:

db.Sentiments.aggregate(
{ $project: {_id:0, Company:1, Sentiment: 1} },
{ $group: {_id: "$Company", SumPosSenti: {$sum: ? }, SumNegSenti: {$sum: ? } } }
);
Pier-Luc Gendreau
  • 13,553
  • 4
  • 58
  • 69
Aafreen Sheikh
  • 4,949
  • 6
  • 33
  • 43
  • 2
    You might be able to use a `$cond` in the sum: http://docs.mongodb.org/manual/reference/aggregation/#conditional-expressions but it sounds like a butchery of speed and scalable querying, I can imagine this query was slow in SQL and will be slow here on a medium result set – Sammaye Dec 31 '12 at 14:18
  • @Sammaye I am trying to replace '?' with {$cond: { Sentiment: { $gte: 0} } } . But that looks like a wrong syntax.. I'm not getting any output. – Aafreen Sheikh Dec 31 '12 at 14:30
  • 2
    $cond works like an if statement exactly like a case, so the first expression would be: `Sentiment >0` and then `Sentiment` and then `0` for the first $cond in the first $sum – Sammaye Dec 31 '12 at 14:31

3 Answers3

84

As Sammaye suggested, you need to use the $cond aggregation projection operator to do this:

db.Sentiments.aggregate(
    { $project: {
        _id: 0,
        Company: 1,
        PosSentiment: {$cond: [{$gt: ['$Sentiment', 0]}, '$Sentiment', 0]},
        NegSentiment: {$cond: [{$lt: ['$Sentiment', 0]}, '$Sentiment', 0]}
    }},
    { $group: {
        _id: "$Company",
        SumPosSentiment: {$sum: '$PosSentiment'},
        SumNegSentiment: {$sum: '$NegSentiment'}
    }});
Mostafiz Rahman
  • 8,169
  • 7
  • 57
  • 74
JohnnyHK
  • 305,182
  • 66
  • 621
  • 471
63

Starting from version 3.4, we can use the $switch operator which allows logical condition processing in the $group stage. Of course we still need to use the $sum accumulator to return the sum.

db.Sentiments.aggregate(
    [
        { "$group": { 
            "_id": "$Company",  
            "SumPosSenti": { 
                "$sum": { 
                    "$switch": { 
                        "branches": [ 
                            { 
                                "case": { "$gt": [ "$Sentiment", 0 ] }, 
                                "then": "$Sentiment"
                            }
                        ], 
                        "default": 0 
                    }
                }
            }, 
            "SumNegSenti": {
                "$sum": { 
                    "$switch": { 
                        "branches": [ 
                            { 
                                "case": { "$lt": [ "$Sentiment", 0 ] }, 
                                "then": "$Sentiment"
                            }
                        ], 
                        "default": 0 
                    } 
                }
            }
        }}
    ]
)

If you have not yet migrated your mongod to 3.4 or newer, then note that the $project stage in this answer is redundant because the $cond operator returns a numeric value which means that you can $group your documents and apply $sum to the $cond expression.

This will improve the performance in your application especially for large collection.

db.Sentiments.aggregate(
    [
        { '$group': {
            '_id': '$Company',
            'PosSentiment': { 
                '$sum': {
                    '$cond': [
                        { '$gt': ['$Sentiment', 0]}, 
                        '$Sentiment', 
                        0
                    ]
                }
            },
            'NegSentiment': { 
                '$sum': {
                    '$cond': [
                        { '$lt': ['$Sentiment', 0]}, 
                        '$Sentiment', 
                        0
                    ]
                }
            }
        }}
    ]
)

Consider a collection Sentiments with the following documents:

{ "Company": "a", "Sentiment" : 2 }
{ "Company": "a", "Sentiment" : 3 }
{ "Company": "a", "Sentiment" : -1 }
{ "Company": "a", "Sentiment" : -5 }

The aggregation query produces:

{ "_id" : "a", "SumPosSenti" : 5, "SumNegSenti" : -6 }
styvane
  • 59,869
  • 19
  • 150
  • 156
  • FYI, this also works for use cases where you want to check a boolean or string instead of a number. Use `$eq` instead for a direct match, and make sure you set your `then` value to the number you want to count by (ex. `then: 1`) – Noah Kreiger Dec 28 '21 at 15:14
9

Explaining the snippets above, that uses the array syntax:

PosSentiment: {$cond: [{$gt: ['$Sentiment', 0]}, '$Sentiment', 0]}

is equal to:

PosSentiment: {$cond: { if: {$gt: ['$Sentiment', 0]}, then: '$Sentiment', else: 0} }

The array syntax summarizes the long syntax to just { $cond: [if, then, else] }

dpolicastro
  • 1,379
  • 1
  • 14
  • 22