0

We want to store ratings of a metric (say sales, profit) for some category (say city) in MondoDB. Example rating scale: [RED, YELLOW, GREEN], the length will be fixed. We are considering the following two document structures:

Structure 1: Ratings as an array
{
    "_id": 1,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : [1, 2, 3],  // frequency of RED, YELLOW, GREEN ratings, fixed length array
        "profit": [4, 5, 6],
    },
}
{
    "_id": 2,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : [1, 2, 3],  // frequency of RED, YELLOW, GREEN ratings, fixed length array
        "profit": [4, 5, 6],
    },
}

Structure 2: Ratings as a map
{
    "_id": 1,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : {             // map will always have "RED", "YELLOW", "GREEN" keys
            "RED": 1,
            "YELLOW": 2,
            "GREEN": 3
        },
        "profit" : {
            "RED":4,
            "YELLOW": 5,
            "GREEN": 6
        },
    },
}
{
    "_id": 2,
    "city": "X",
    "metrics": ["sales", "profit"],
    "ratings" : {
        "sales" : {             // map will always have "RED", "YELLOW", "GREEN" keys
            "RED": 1,
            "YELLOW": 2,
            "GREEN": 3
        },
        "profit" : {
            "RED":4,
            "YELLOW": 5,
            "GREEN": 6
        },
    },
}

Our use case:

  1. aggregate ratings grouped by city and metric
  2. we do not intend to index on the "ratings" field

So for structure 1, to aggregate ratings, I need element-wise aggregations and it seems it will likely involve unwind steps or maybe map-reduce and the resulting document would look something like this:

{
    "city": "X",
    "sales": [2, 4, 6]
    "profit": [8, 10, 12]
}

For structure 2, I think aggregation would be relatively straightforward using the aggregation pipeline, ex (aggregating just sales):

db.getCollection('Collection').aggregate([
    {
        $group: {
            "_id": {"city": "$city" },
            "sales_RED": {$sum: "$ratings.sales.RED"},
            "sales_YELLOW": {$sum: "$ratings.sales.YELLOW"},
            "sales_GREEN": {$sum: "$ratings.sales.GREEN"}
       }
    },
    {
        $project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
    }
])

Would give the following result:

{
    "city": "X",
    "sales": [2, 4, 6]
}

Query: I am tending towards the second structure mainly because I am not clear on how to achieve element-wise array aggregation in MOngoDB. From what I have seen it will probably involve unwinding. The second document structure will have a larger document size because of the repeated field names for the ratings but the aggregation itself is simple. Can you please point out, based on our use case, how would they compare in terms of computational efficiency, and if I am missing any points worth considering?

racerX
  • 930
  • 9
  • 25
  • 1
    Dot notation in MongoDB in principle supports positional array access (`foo.0`), have you tried this? – D. SM Jun 09 '20 at 18:13
  • Hi @D.SM, thanks, I had not tried this, but I looked and found that although dot notation for positional access works in find queries, it does not in the aggregation framework. However, that lead me to try $arrayElemAt which does work for arrays inside the aggregation framework. – racerX Jun 09 '20 at 18:57

1 Answers1

1

I was able to achieve the aggregation with the array structure using $arrayElemAt. (However, this still involves having to specify aggregations for individual array elements, which is the same as the case for document structure 2)

db.getCollection('Collection').aggregate([
    {
        $group: {
            "_id": {"city": "$city" },
            "sales_RED": {$sum: { $arrayElemAt: [ "$ratings.sales", 0] }},
            "sales_YELLOW": {$sum: { $arrayElemAt: [ "$ratings.sales", 1] }},
            "sales_GREEN": {$sum: { $arrayElemAt: [ "$ratings.sales", 2] }},
       }
    },
    {
        $project: {"_id": 0, "city": "$_id.city", "sales": ["$sales_RED", "$sales_YELLOW", "$sales_GREEN"]}
    }
])
racerX
  • 930
  • 9
  • 25