Below is my collection, which contains duplicate records. Can you please explain how to write code that deletes the duplicate records from it?

   /* 1 */
{
    "_id" : ObjectId("5e84200bdf949c00404ed5ff"),
    "area" : "573",
    "bc" : "GER",
    "bd" : "52001450",
    "bg" : "52001450",
    "borg" : "cde5642",
    "bsg" : "51585929",
    "bsgname" : "INFO TECHNOLOGY",
    "consulting" : null,
    "mobilePhoneNumber" : null,
    "cfax" : null,
    "l" : "BERL",
    "cpgr" : null,
    "o" : "S",
    "friendlyCountryName" : "Germ",
    "ctel" : "+49",
    "mail" : "tl2625@ge.at.com",
    "exch" : "204",
    "ext" : "5408",
    "facsimileTelephoneNumber" : null,
    "givenName" : "POMAS",
    "employeeNumber" : "0249527",
    "jt" : "MC",
    "jtname" : "FLEX FORCE ENGINEER IV",
    "sn" : "LEMP",
  
}

/* 2 */

{
    "_id" : ObjectId("5e84200bdf949c00404ed601"),
    "area" : "573",
    "bc" : "GER",
    "bd" : "52001450",
    "bg" : "52001450",
    "borg" : "cde5642",
    "bsg" : "51585929",
    "bsgname" : "INFO TECHNOLOGY",
    "consulting" : null,
    "mobilePhoneNumber" : null,
    "cfax" : null,
    "l" : "BERL",
    "cpgr" : null,
    "o" : "S",
    "friendlyCountryName" : "Germ",
    "ctel" : "+49",
    "mail" : "tl2625@ge.at.com",
    "exch" : "204",
    "ext" : "5408",
    "facsimileTelephoneNumber" : null,
    "givenName" : "POMAS",
    "employeeNumber" : "0249527",
    "jt" : "MC",
    "jtname" : "FLEX FORCE ENGINEER IV",
    "sn" : "LEMP",
}

prat_86

1 Answer

With the following code you can find the duplicated records. (Hint: replace `...` with the other fields.)

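var cursor = db.collection.aggregate([
    {
        // Group documents that share the same values for the listed fields.
        "$group": {
            "_id": {
                "area": "$area",
                ..., // fill in the other fields here
                "sn": "$sn"
            },
            // Number of documents in each group.
            "count": { "$sum": 1 },
            // Collect the _id of every document in the group.
            "assets": { "$push": { "assets_id": "$_id" } }
        }
    },
    {
        // Keep only the groups that contain more than one document.
        "$match": { "count": { "$gt": 1 } }
    },
    {
        "$project": { "assets": "$assets" }
    }
])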

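Then delete the duplicates by filtering on _id, keeping the first document of each group:

cursor.forEach(function (doc) {
    // Keep the first document of each group and remove the remaining duplicates.
    var ids = doc.assets.slice(1).map(function (a) { return a.assets_id; });
    db.collection.remove({ "_id": { "$in": ids } });
});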

Now you just need to replace `...` with your other fields, in the same way I filled in area and sn.
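If it helps to see the grouping idea without a database, here is a minimal sketch in plain JavaScript that works on an in-memory array instead of a MongoDB collection. Everything in it is an assumption made for illustration: the `records` array, the shortened field list, the placeholder `_id` strings, and the helper names `groupKey` and `findDuplicateIds` are hypothetical and not part of the MongoDB API. It treats two documents as duplicates when every field except `_id` matches, and it keeps the first document of each group.

```javascript
// Hypothetical in-memory stand-in for the collection; only a few fields are shown
// and the _id values are plain placeholder strings rather than real ObjectIds.
const records = [
    { _id: "id1", area: "573", mail: "tl2625@ge.at.com", sn: "LEMP" },
    { _id: "id2", area: "573", mail: "tl2625@ge.at.com", sn: "LEMP" },
    { _id: "id3", area: "573", mail: "tl2625@ge.at.com", sn: "LEMP" },
    { _id: "id4", area: "100", mail: "other@example.com", sn: "DOE" }
];

// Build a grouping key from every field except _id. Field names are sorted
// so that the order of keys inside a document does not affect the result.
function groupKey(doc) {
    const fields = Object.keys(doc)
        .filter(function (name) { return name !== "_id"; })
        .sort();
    return JSON.stringify(fields.map(function (name) { return [name, doc[name]]; }));
}

// Return the _id values to delete: every document after the first one in its group.
function findDuplicateIds(docs) {
    const seen = new Set();
    const duplicateIds = [];
    docs.forEach(function (doc) {
        const key = groupKey(doc);
        if (seen.has(key)) {
            duplicateIds.push(doc._id);
        } else {
            seen.add(key);
        }
    });
    return duplicateIds;
}

const idsToDelete = findDuplicateIds(records);
const remaining = records.filter(function (doc) {
    return !idsToDelete.includes(doc._id);
});

console.log(idsToDelete); // [ 'id2', 'id3' ]
console.log(remaining.length); // 2
```

The aggregation does the same thing on the server: the `_id` of `$group` plays the role of `groupKey`, and the pushed `assets` array holds the ids from which the ones to delete are picked.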

Maryam
  • Thanks, experts. Can you please tell me how the query needs to be changed? – prat_86 Jul 06 '20 at 13:14
  • I edited the answer. Ask again if you have a question about the `...` comment. – Maryam Jul 06 '20 at 13:45
  • Thanks, experts. If my collection is queried like db.getCollection('employeerecords').find({}), how can I apply the above query? I also want to create a new collection, add duplicate records, and then delete them for testing. – prat_86 Jul 07 '20 at 11:21
  • The `aggregate` method is more complicated than `find`. `find` just finds records according to some filters, but an aggregation has multiple stages (`find` has one step): at each stage you create an intermediate table with new fields, and at the end you can choose the output fields with `$project`. So I suggest you run the above command with your collection's name substituted, and then remove the resulting ids. To create test data, run `db.getCollection('employeerecords').insert({....})`, filling `...` with arbitrary fields, and repeat it; the document is then inserted multiple times. – Maryam Jul 07 '20 at 11:41
  • But we can enter duplicate records into the collection, I mean the same fields and values, while the ObjectId will still be unique, right? – prat_86 Jul 09 '20 at 06:49