1

I'm trying to aggregate over saved user sessions and get the amount of single active visitors per date in a date range.

I have a session model which contains these properties:

{
  'environment' : ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx"),
  'created'     : ISODate("2021-01-05T22:02:25.757Z"),
  'visitor'     : ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}

The result should look like this:

[
  {
    'date'  : '2021-01-03',
    'count' : 5 // this is the amount of unique visitors. If there are two documents with the same date and visitor id, only one should be count.
  },
  {
    'date'  : '2021-01-05',
    'count' : 15 
  },
  {
    'date'  : '2021-01-06',
    'count' : 11
  },
  ...etc...
]

This is the last pipeline I tried which of course is wrong:

const data = await Session.aggregate([
  {
    '$match': {
       environment : ObjectID( args.environment ),
       created     : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
    }
  },
  {
    $addFields:{
      createdDate:{
        $dateFromParts:{
          year:{
            $year:"$created"
          }, 
          month:{
            $month:"$created"
          }, 
          day:{
            $dayOfMonth : "$created" 
          }
        }
      }
    }
  },
  {
    $group:{
      _id:{
        date:"$createdDate",visitor:"$visitor"
      },
      count:{$sum:1}
    }
  },
  {
    $project:{
      _id:0,
      date:"$_id.date",
      count:"$count",
    }
  }
])

I tried a few of my own and a few SO combinations for my pipeline but no great success yet.

Help would be very much appreciated.

Getter Jetter
  • 2,033
  • 1
  • 16
  • 37

2 Answers2

1

I think what you are searching for is the $addToSet operator.

Returns an array of all unique values that results from applying an expression to each document in a group of documents that share the same group by key.

The doc : https://docs.mongodb.com/manual/reference/operator/aggregation/addToSet/

You just need to group by day and add the visitors id to the set, if they exist they are not added if not they are, and kaboom. After that you just need to count how many elements in that list.

const data = await Session.aggregate([
  {
    $match: {
      environment : ObjectID( args.environment ),
      created     : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
    }
  },
  {
    $group: {
      _id: {
        $dateToString: {
          format : "%Y-%m-%d",
          date   : "$created"
        },
      },
      visitors: { $addToSet: "$visitor" }
    }
  },
  {
    $project: {
      _id: 0,
      date: "$_id",
      count: { $size: "$visitors" }
    }
  }
])
Double Bang
  • 184
  • 15
  • And kaboom works. Well almost. But it hinted me in the right direction. See this fiddle for my final pipeline. If you want you can update your answer with it and I'll accept it. So people in future have a working snipped to look at. Thanks a lot, I'm very happy with the fact that the pipeline stayed so compact. https://jsfiddle.net/k06rndjm/2/ – Getter Jetter Jan 07 '21 at 20:16
  • My pleasure, It's easier to read when kept compact too ! – Double Bang Jan 08 '21 at 15:46
0

You can use a project stage with $dateToString, in addition it allows to specify a timezone if needed

 const data = await Session.aggregate([
  {
    '$match': {
       environment : ObjectID( args.environment ),
       created     : { '$gte': new Date( args.start ), '$lte': new Date( args.end ) }
    }
  },
  {
    $project:{
      visitor: 1,
      createdDate: { $dateToString: { format: "%Y-%m-%d", date: "$created" } },
    }
  },
  {
    $group:{
      _id:{
        date:"$createdDate",visitor:"$visitor"
      },
      count:{$sum:1}
    }
  },
  {
    $project:{
      _id:0,
      date:"$_id.date",
      count:"$count",
    }
  }
])
Daphoque
  • 4,421
  • 1
  • 20
  • 31