Given the following data in MongoDB:

[ 
  { id: 1, stuff: ["A", "B"] },
  { id: 2, stuff: ["B", "C"] },
  ... (lots and lots of records)
]

Is it possible to get the union of all "stuff" sets? e.g. ["A","B","C"]

I've tried using $addToSet

aggregate([
  { $group: {
      _id: null, 
      allStuff: { $addToSet: "$stuff" }
    }
  }
])

but that produces a set of arrays rather than a flat set, e.g. [ ["A", "B"], ["B", "C"] ]

Salvador Dali
Constantinos
  • Everything is possible. Have you given it any thought, or do you just want other people to do the work for you? – Salvador Dali Sep 28 '14 at 02:26
  • @SalvadorDali I've tried using `$addToSet`. Edited question to show what I've done so far. Looked through docs but can't find any other reference to set union. – Constantinos Sep 28 '14 at 07:01

1 Answer

OK, now that you've shown your attempt, here is what you can do:

db.a.aggregate([
  { $unwind : "$stuff" },
  { $group : {
    _id: null,
    all : {$addToSet : "$stuff"}
  }}
])

The $unwind stage first emits one document per element of each "stuff" array. The $group stage then adds every one of those elements to a single set with $addToSet, which discards duplicates.

db.a.insert({ id: 1, stuff: ["A", "B"] })
db.a.insert({ id: 2, stuff: ["B", "C"] })
db.a.insert({ id: 3, stuff: ["A", "B", "C", "D"] })

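Running the pipeline on these three documents gives: { "_id" : null, "all" : [ "D", "C", "B", "A" ] } (the order of elements in an $addToSet result is not guaranteed).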
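
To see why the two stages yield a flat set, the same logic can be mimicked in plain JavaScript, outside MongoDB. This is only a sketch of the idea, not how the server implements it; `unwind` and `groupAddToSet` are hypothetical helper names made up for the illustration.

```javascript
// Sample documents from the inserts above.
const docs = [
  { id: 1, stuff: ["A", "B"] },
  { id: 2, stuff: ["B", "C"] },
  { id: 3, stuff: ["A", "B", "C", "D"] },
];

// Rough analogue of { $unwind: "$stuff" }:
// one output document per array element.
function unwind(documents) {
  return documents.flatMap((doc) =>
    doc.stuff.map((item) => ({ id: doc.id, stuff: item }))
  );
}

// Rough analogue of $group with _id: null and $addToSet:
// collect the distinct values into a single result document.
function groupAddToSet(documents) {
  const all = new Set();
  for (const doc of documents) {
    all.add(doc.stuff);
  }
  return { _id: null, all: [...all] };
}

const unwound = unwind(docs);
const result = groupAddToSet(unwound);

console.log(unwound.length);            // 8 single-element documents
console.log(result.all.slice().sort()); // sorted only to get a stable order
```

Skipping the unwind step and adding each whole `stuff` array to the set is what produces the "set of sets" from the question.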

Salvador Dali
  • Thanks - how expensive is $unwind? I expect to have a few thousand records, and the average size of the "stuff" array would be 10-20. – Constantinos Sep 28 '14 at 07:29
  • @Constantinos sorry, but it is hard for me to give an estimate. I can only say that a few thousand records is very small for a database. You could also use [foreach](http://stackoverflow.com/a/22890331/1090562) to collect all the elements into an array and then find the unique ones. Measure both approaches to find out which is better. – Salvador Dali Sep 28 '14 at 07:33
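
The client-side alternative mentioned in the last comment could look roughly like the sketch below. It assumes a JavaScript environment with `Set` and an argument that exposes `forEach`, such as a mongo shell cursor or a plain array; `unionOfStuff` is a hypothetical helper, not part of any MongoDB API.

```javascript
// Collect the distinct "stuff" values from anything that has forEach
// (a shell cursor or an array of documents).
function unionOfStuff(cursor) {
  const seen = new Set();
  cursor.forEach((doc) => {
    // Skip documents where the field is missing or not an array.
    if (Array.isArray(doc.stuff)) {
      doc.stuff.forEach((item) => seen.add(item));
    }
  });
  return Array.from(seen);
}

// Plain-array stand-in for a cursor, so the sketch runs without a database.
const sample = [
  { id: 1, stuff: ["A", "B"] },
  { id: 2, stuff: ["B", "C"] },
  { id: 3 }, // no "stuff" field
];

const union = unionOfStuff(sample);
console.log(union);
```

In the shell it would be called as `unionOfStuff(db.a.find({}, { stuff: 1 }))`. Unlike the aggregation, this transfers every document to the client, so it is worth measuring both on real data, as the comment suggests.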