I'm writing a deduplication script in MongoDB, but the aggregation returns users whose mobile number is null or an empty string, treating them all as duplicates of each other. I've tried using $ne but can't quite get it to work. Does anyone know how to return only the duplicates whose mobile number is neither null nor an empty string?

    $mobile_duplicates = User::raw(function ($collection) {
        return $collection->aggregate(
            [
                [
                    '$limit' => 200000,
                ],
                [
                    '$group' => [
                        '_id' => [
                            'mobile', //=> '$mobile',
                        ],
                        'uniqueIds' => [
                            '$addToSet' => '$_id',
                        ],
                        'count' => [
                            '$sum' => 1,
                        ],
                    ],
                ],
                [
                    '$match' => [
                        // '_id' => [
                        //    '$ne' => "",
                        // ],
                        // '_id' => [
                        //    '$ne' => null,
                        // ],
                        'count' => [
                            '$gt' => 1,
                        ],
                    ],
                ]
            ],
            [
                'allowDiskUse' => true,
            ]
        );
    });

Thanks in advance!

chloealee
  • Flag the question as a duplicate instead of answering that it is a duplicate. Thanks for the indication, though; SO is always grateful to OPs trying to close their questions. – Petter Friberg Dec 10 '15 at 09:08

1 Answer

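Found the answer in this post: stackoverflow.com/questions/14184099/… The fix is to split the $match into two separate stages: one before the $group to filter out the unwanted mobile values, and one after it to keep only the groups with more than one document. This worked for me: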

    $mobile_duplicates = User::raw(function ($collection) {
        return $collection->aggregate(
            [
                [
                    '$match' => [
                        'mobile' => [
                            '$ne' => '',
                            '$exists' => true,
                        ],
                    ],
                ],
                [
                    '$group' => [
                        '_id' => [
                            'mobile' => '$mobile',
                        ],
                        'uniqueIds' => [
                            '$addToSet' => '$_id',
                        ],
                        'count' => [
                            '$sum' => 1,
                        ],
                    ],
                ],
                [
                    '$match' => [
                        'count' => [
                            '$gt' => 1,
                        ],
                    ],
                ],
            ],
            [
                'allowDiskUse' => true,
            ]
        );
    });
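
One caveat about the first $match above: as far as I understand MongoDB's query semantics, `'$exists' => true` also matches documents where the field is present but set to null, so null mobiles could still end up grouped together. A minimal sketch of a variant that uses `$nin` to exclude null, missing, and empty-string values in one condition follows. The function name `buildMobileDuplicatePipeline` is hypothetical and only there so the pipeline can be inspected on its own; I have not run this against the original collection.

```php
<?php
// Hypothetical helper (not from the original post): builds the aggregation
// pipeline as a plain PHP array so it can be inspected without a database.
function buildMobileDuplicatePipeline(): array
{
    return [
        // Assumption: '$nin' => [null, ''] drops documents whose mobile is
        // null, missing, or an empty string before any grouping happens.
        [
            '$match' => [
                'mobile' => ['$nin' => [null, '']],
            ],
        ],
        // Group the remaining documents by mobile number and collect their ids.
        [
            '$group' => [
                '_id' => ['mobile' => '$mobile'],
                'uniqueIds' => ['$addToSet' => '$_id'],
                'count' => ['$sum' => 1],
            ],
        ],
        // Keep only mobile numbers shared by more than one document.
        [
            '$match' => [
                'count' => ['$gt' => 1],
            ],
        ],
    ];
}
```

The returned array could be passed as the first argument to `$collection->aggregate(...)` inside the same `User::raw` closure, with `['allowDiskUse' => true]` as the second argument as before.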
chloealee
  • Please use the edit link on your question to add additional information. The Post Answer button should be used only for complete answers to the question. - [From Review](/review/low-quality-posts/10504401) – Tamil Selvan C Dec 10 '15 at 04:55
  • This is the complete answer! I am just waiting to mark it with the green button after the time limit. – chloealee Dec 10 '15 at 16:16
  • Just elaborate on your answer using the content of the link, because if the link is deleted your answer will no longer be useful to others. – Tamil Selvan C Dec 10 '15 at 16:31