1

I'm trying to use the aggregation framework to perform group counts in mongo but the results are not exactly as expected.

Consider the collection bellow

 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "3", "day" => "Monday", 'age' => 24));
 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "1", "day" => "Monday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Monday", 'age' => 25));
 $people->insert(array("user_id" => "4", "day" => "Monday", 'age' => 33));
 $people->insert(array("user_id" => "1", "day" => "Tuesday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Tuesday", 'age' => 25));
 $people->insert(array("user_id" => "1", "day" => "Wednesday", 'age' => 18));
 $people->insert(array("user_id" => "2", "day" => "Thursday", 'age' => 25));
 $people->insert(array("user_id" => "1", "day" => "Friday", 'age' => 18));

I use the query below try count the number of distinct entries (user_id) for each day of the week.

$query = array(
        array(
            '$project' => array(
                'user_id' =>1,
                'day' =>1,
            ),
        ),
        array(
            '$group' => array(
                '_id'  => array(
                    'user_id' => '$user_id',
                 'day' => '$day'),
                'count' => array('$sum' => 1),
            )
        ));

So for the collection above the results should be

Monday = 3     Tues = 2,     Wed = 1,     Thur = 1 and    Friday = 1

but it does not group the totals of all DISTINCT users_id under a day, and instead for each day it gives me a total for each existing user_id.`

Results (not complete)

     [result] => Array
    (
        [0] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 1
                        [day] => Friday
                    )

                [count] => 1
            )

        [1] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 1
                        [day] => Wednesday
                    )

                [count] => 1
            )

        [2] => Array
            (
                [_id] => Array
                    (
                        [user_id] => 2
                        [day] => Tuesday
                    )

                [count] => 1
            )
... ... ...

Can someone help me to filter the daily totals so that it only includes distinct totals per day

I have looked at the $unwind but couldn't really get my head around it. `

Ferox
  • 461
  • 4
  • 13

1 Answers1

2

If I'm understanding the question right, what you're trying to get at is

 totals of all DISTINCT users_id under a day

Or as I understand it: Count of unique user_ids per day.

For that, you could take the group you already have and cut out the count so that you just have a unique _id.user_id and _id.day value:

'$group' => array(
            '_id'  => array(
                'user_id' => '$user_id',
                'day' => '$day'
            )
        )

Then pipe that to another $group statement that counts the number of documents per day, since there is exactly one for every unique user_id/day combination:

'$group' => array(
            '_id'  => '$_id.day',
            'count' => array('$sum' => 1)
        )
Verran
  • 3,942
  • 2
  • 14
  • 22
  • Can you please have a look at [this other question](http://stackoverflow.com/questions/25150635/mongo-age-group-aggregation) for me. Is a identical scenario and i just cannot figure out. Thanks in advance – Ferox Aug 06 '14 at 00:53