
In MongoDB I have a collection where an array field contains duplicate entries, like this:

{
    "_id": ObjectId("57cf3cdd5f20a3b0ba009777"),
    "Chat": 6,
    "string": [
        "1348157031 Riyadh",
        " 548275320 Mohammad Sumon",
        " 1348157031 Riyadh",
        " 548275320 Mohammad Sumon",
        " 1348157031 Riyadh",
        " 1348157031 Riyadh"
    ]
}

I need to remove the duplicate entries and keep only the unique array values, like below:

{
    "_id": ObjectId("57cf3cdd5f20a3b0ba009777"),
    "Chat": 6,
    "string": [
        "1348157031 Riyadh",
        " 548275320 Mohammad Sumon"
    ]
}

What would be the best way to do this?

Thanks

Sumon
  • Do you want to modify existing documents, or apply this to new documents only? – Nitin Verma Sep 08 '16 at 19:58
  • Possible duplicate of [How to remove duplicate entries from an array?](http://stackoverflow.com/questions/9862255/how-to-remove-duplicate-entries-from-an-array) – dyouberg Sep 08 '16 at 20:04
  • Yes, I do want to modify the existing document – Sumon Sep 08 '16 at 20:13
  • I understand this is kind of a duplicate, but I would be grateful if anyone could help – Sumon Sep 08 '16 at 20:20
  • You probably need to do this client side, like this (which is listed in the duplicate link): http://stackoverflow.com/questions/8405331/how-to-remove-duplicate-record-in-mongodb-by-mapreduce – dyouberg Sep 08 '16 at 20:27
  • The above link tries to remove whole records, but I just need to remove the duplicate entries from the list – Sumon Sep 08 '16 at 20:40

2 Answers

db.getCollection('Test').aggregate([
    // emit one document per array element
    { $unwind: '$string' },
    // regroup by document, collecting only distinct elements
    {
        $group: {
            _id: '$_id',
            string: { $addToSet: '$string' },
            Chat: { $first: '$Chat' }
        }
    }
]);

Output: "1348157031 Riyadh" appears twice here because one copy has a leading space, which makes it a different string.

{
    "_id" : ObjectId("57cf3cdd5f20a3b0ba009777"),
    "string" : [ 
        " 1348157031 Riyadh", 
        " 548275320 Mohammad Sumon", 
        "1348157031 Riyadh"
    ],
    "Chat" : 6
}
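
Because the leading space makes otherwise identical entries distinct, trimming before de-duplicating collapses them. The following is a minimal client-side sketch in plain JavaScript, not part of the original answer; `dedupeTrimmed` is a hypothetical helper name, and the sample array is copied from the question.

```javascript
// Hypothetical helper: trim each entry, then keep the first occurrence of each value.
// A Set preserves insertion order, so the original ordering of first occurrences is kept.
function dedupeTrimmed(values) {
  return [...new Set(values.map((v) => v.trim()))];
}

// Sample data taken from the question's document.
const sample = [
  "1348157031 Riyadh",
  " 548275320 Mohammad Sumon",
  " 1348157031 Riyadh",
  " 548275320 Mohammad Sumon",
  " 1348157031 Riyadh",
  " 1348157031 Riyadh",
];

const cleaned = dedupeTrimmed(sample);
console.log(cleaned); // [ '1348157031 Riyadh', '548275320 Mohammad Sumon' ]

// Assumption: in the mongo shell, the result could be written back per document, e.g.
// db.getCollection('Test').find().forEach(function (doc) {
//   db.getCollection('Test').updateOne(
//     { _id: doc._id },
//     { $set: { string: dedupeTrimmed(doc.string) } }
//   );
// });
```

Unlike the aggregation above, this changes the stored values (the leading spaces are removed), so it only fits if the whitespace is unintentional.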
Ehsan
Shantanu Madane
  • Thanks Shantanu, is there any code I can use to trim the leading space from the elements of this array? – Sumon Sep 09 '16 at 09:11
  • If the space is there by mistake, I would suggest fixing it directly in the db, and always trimming on the client side before saving from now on – Shantanu Madane Sep 09 '16 at 09:20
  • @Shantanu I am able to aggregate, but how do I permanently delete the duplicate elements from the db? – Sumon Sep 09 '16 at 16:40
  • Follow this link, it could help: http://stackoverflow.com/questions/18804404/mongodb-unwind-array-using-aggregation-and-remove-duplicates – Shantanu Madane Sep 12 '16 at 08:43

MongoDB 3.4+ has the $addFields aggregation stage, which lets you avoid explicitly listing all the other fields to keep:

collection.aggregate([
    {"$addFields": {
        "string": {"$setUnion": ["$string", []]}
    }}
])
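
Note that aggregate() only returns the de-duplicated result; it does not change the stored documents. On MongoDB 4.2 or later, update commands accept an aggregation pipeline, so the same $setUnion expression can be written back. The sketch below is an illustration under that assumption: `buildDedupePipeline` is a hypothetical helper, and the $map/$trim step (available from MongoDB 4.0) is an optional addition for the leading-space issue discussed in the comments.

```javascript
// Hypothetical helper: builds an update pipeline that trims every string in
// an array field and then removes duplicates with $setUnion.
function buildDedupePipeline(field) {
  return [
    {
      $set: {
        [field]: {
          $setUnion: [
            // trim whitespace from each element so " x" and "x" compare equal
            { $map: { input: "$" + field, as: "s", in: { $trim: { input: "$$s" } } } },
            [],
          ],
        },
      },
    },
  ];
}

const pipeline = buildDedupePipeline("string");
console.log(JSON.stringify(pipeline, null, 2));

// Assumption: in the mongo shell on MongoDB 4.2+, this could be applied with
// db.getCollection('Test').updateMany({ string: { $type: "array" } }, pipeline);
```

The `$type: "array"` filter skips documents where the field is missing or not an array. $setUnion does not guarantee the order of the resulting elements, so this is only suitable when element order does not matter.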

Just for reference, here is another (more lengthy) way that uses $replaceRoot and also doesn't require listing all possible fields:

collection.aggregate([
    {'$unwind': {
        'path': '$string',
        // output the document even if its 'string' array is empty
        'preserveNullAndEmptyArrays': true
    }},
    {'$group': {
        '_id': '$_id',
        'string': {'$addToSet': '$string'},
        // arbitrary name that doesn't exist on any document
        '_other_fields': {'$first': '$$ROOT'},
    }},
    {
      // per the docs, when $mergeObjects sees the same field in several documents,
      // the value from the last document wins, so the new deduped array is used
      '$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
    },
    {'$project': {'_other_fields': 0}}
])    
Dennis Golomazov