I have a MongoDB collection with documents like:

{'city': 'NYC', 'value': 'blue'},
{'city': 'NYC', 'value': 'red'},
{'city': 'Boston', 'value': 'blue'},
{'city': 'Boston', 'value': 'green'}

I want to group by city and collect, for each city, a list of its distinct values of value, like:

{'city': 'NYC', 'values': ['blue', 'red']},
{'city': 'Boston', 'values': ['blue', 'green']}

How can I do this in a PyMongo pipeline?

I have a skeleton like this so far:

cursor = db.aggregate([
        {'$group': {
            '_id': {
                'value': '$value',
                'city': '$city'
            }
        }},
])
OJT
    Look at this [answer](https://stackoverflow.com/a/53836634/8987128); you just need a $group stage like the one in that answer. – turivishal Jun 23 '21 at 03:53

1 Answer

In the _id field of the $group stage, specify only the keys you want to group by (city in your case).

The remaining keys in the stage are the additional fields you want in the result, each computed by an accumulator. $addToSet appends each value it encounters within a group to an array, skipping duplicates.
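
To illustrate what grouping on city with $addToSet computes, here is a rough plain-Python equivalent. It is only a sketch and does not talk to MongoDB: the helper name group_add_to_set is made up for this illustration, and the fifth document is an extra duplicate added to show the de-duplication. The arrays are sorted here only to make the result predictable; MongoDB does not guarantee the order of elements produced by $addToSet.

```python
from collections import defaultdict

# Sample documents from the question, plus one duplicate added for illustration.
docs = [
    {"city": "NYC", "value": "blue"},
    {"city": "NYC", "value": "red"},
    {"city": "Boston", "value": "blue"},
    {"city": "Boston", "value": "green"},
    {"city": "NYC", "value": "blue"},  # duplicate, should not appear twice
]


def group_add_to_set(documents, key, field):
    """Hypothetical helper: mimic {'$group': {'_id': key, field: {'$addToSet': ...}}}."""
    groups = defaultdict(set)
    for doc in documents:
        # A set drops duplicates, which is the effect $addToSet has per group.
        groups[doc[key]].add(doc[field])
    # Sorting is only for a stable result in this sketch.
    return {group_key: sorted(values) for group_key, values in groups.items()}


result = group_add_to_set(docs, "city", "value")
print(result)
```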

Below is the aggregation code you are looking for:

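# Group by city and collect the distinct values seen for each city.
# As in the question, db is assumed to refer to the collection holding the documents.
cursor = db.aggregate([
  {
    "$group": {
      "_id": "$city",            # group key: one output document per city
      "value": {
        "$addToSet": "$value"    # distinct values for this city
      }
    }
  },
])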

In the above code, _id holds the grouped city names and value holds the list of distinct values for each city.
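
If you want the output shaped exactly as in the question (a city key and a values key instead of _id), one possible variant is to name the accumulated field values and add a $project stage that renames _id. This is a sketch rather than a tested drop-in: the helper name distinct_values_by_city is hypothetical, and collection is assumed to be a PyMongo Collection object (anything with an aggregate method works for the sketch).

```python
# Stage 1 groups by city and collects distinct values.
# Stage 2 renames _id to city and drops the original _id field.
PIPELINE = [
    {"$group": {"_id": "$city", "values": {"$addToSet": "$value"}}},
    {"$project": {"_id": 0, "city": "$_id", "values": 1}},
]


def distinct_values_by_city(collection):
    """Hypothetical helper: run PIPELINE on a collection and return a list of dicts."""
    return list(collection.aggregate(PIPELINE))
```

With a live connection, the call might look like distinct_values_by_city(client["mydb"]["mycoll"]), where the database and collection names are placeholders. The order of elements inside each values array is not guaranteed.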

hhharsha36