I have a collection with documents like this:

{
    "_id" : ObjectId("5c0685fd6afbd73b80f45338"),
    "page_id" : "1234",
    "category_list" : [  
        "football", 
        "sport"
    ],
    "time_broadcast" : "09:13"
}

{
    "_id" : ObjectId("5c0685fd6afbd7355f45338"),
    "page_id" : "1234",
    "category_list" : [ 
        "sport",
        "handball"
    ],
    "time_broadcast" : "09:13"
}

{
    "_id" : ObjectId("5c0694ec6afbd74af41ea4af"),
    "page_id" : "123456",
    "category_list" : [ 
        "news", 
        "updates"
    ],
    "time_broadcast" : "09:13"
}

....

now = datetime.datetime.now().time().strftime("%H:%M")

What I want: when "time_broadcast" equals "now", get the list of distinct "category_list" values for each "page_id".

Here is what the output should look like:

[
   { 
     "page_id" : "1234",
     "category_list" : ["football", "sport", "handball"] 
   },

   { 
     "page_id" : "123456",
     "category_list" : ["news", "updates"] 
   }
]

I tried this:

category_list = db.users.find({'time_broadcast': now}).distinct("category_list")

but this returns one list of distinct values across all "page_id" values:

 ["football", "sport", "handball","news", "updates"] 

rather than one category_list per page_id.

Any help, please?

Thanks

Wiem

1 Answer

You need to write an aggregation pipeline:

  • $match - filter the documents by criteria
  • $group - group the documents by key field
  • $addToSet - aggregate the unique elements
  • $project - project in the required format
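  • $reduce - flatten the array of arrays into a single array; $setUnion (rather than $concatArrays) keeps each element only once

aggregate query

db.tt.aggregate([
    // keep only the documents broadcast at the requested time
    {$match : {"time_broadcast" : "09:13"}}, 
    // one group per page_id; $addToSet collects the distinct category_list arrays
    {$group : {"_id" : "$page_id", "category_list" : {$addToSet : "$category_list"}}}, 
    // merge the collected arrays into one array of unique values
    {$project : {"_id" : 0, "page_id" : "$_id", "category_list" : {$reduce : {input : "$category_list", initialValue : [], in: { $setUnion : ["$$value", "$$this"] }}}}}
]).pretty()

result (the order of elements within each array is not guaranteed)

{ "page_id" : "123456", "category_list" : [ "news", "updates" ] }
{
        "page_id" : "1234",
        "category_list" : [
                "football",
                "handball",
                "sport"
        ]
}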
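
A note on deduplication: $addToSet applied to an array field treats each whole array as one value, so it removes repeated arrays, not repeated elements. Flattening the collected arrays by plain concatenation can therefore repeat an element such as "sport", while a set union keeps each element once. The plain-Python sketch below imitates both flattening strategies on the sample documents from the question. The helper names are made up for illustration, and nothing here talks to MongoDB.

```python
from itertools import chain

# Sample documents from the question (only the fields the pipeline uses).
docs = [
    {"page_id": "1234", "category_list": ["football", "sport"]},
    {"page_id": "1234", "category_list": ["sport", "handball"]},
    {"page_id": "123456", "category_list": ["news", "updates"]},
]

def collect_arrays(documents):
    """Imitate $group + $addToSet on an array field: distinct arrays per page_id."""
    grouped = {}
    for doc in documents:
        arrays = grouped.setdefault(doc["page_id"], [])
        if doc["category_list"] not in arrays:  # whole arrays are compared
            arrays.append(doc["category_list"])
    return grouped

def flatten_concat(arrays):
    """Imitate $reduce with $concatArrays: duplicates across arrays survive."""
    return list(chain.from_iterable(arrays))

def flatten_union(arrays):
    """Imitate $reduce with $setUnion: each element once (sorted here for display)."""
    return sorted(set(chain.from_iterable(arrays)))

grouped = collect_arrays(docs)
concatenated = {page: flatten_concat(a) for page, a in grouped.items()}
unioned = {page: flatten_union(a) for page, a in grouped.items()}

print(concatenated["1234"])  # "sport" appears twice
print(unioned["1234"])       # each category appears once
```

The sorting in flatten_union is only there to make the printed output stable; MongoDB itself does not promise any particular element order for a set union.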

You can add a $sort stage on page_id at the end of the pipeline if the results need a fixed order.
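
Since the question uses Python, here is a hedged PyMongo-style sketch of an equivalent pipeline. It takes a slightly different route: it unwinds category_list before grouping, so $addToSet sees individual elements and no flattening step is needed, and it appends the optional $sort. The function name build_pipeline is made up for illustration, db.users is the collection name from the question, and the commented-out lines assume a running MongoDB server and an existing pymongo database object named db.

```python
import datetime

def build_pipeline(time_broadcast, sort_by_page=True):
    """Build an aggregation pipeline giving the distinct categories per page_id."""
    pipeline = [
        # Keep only documents scheduled for the given "HH:MM" time.
        {"$match": {"time_broadcast": time_broadcast}},
        # Emit one document per category so $addToSet works on elements.
        {"$unwind": "$category_list"},
        # Collect the distinct categories of each page_id.
        {"$group": {"_id": "$page_id",
                    "category_list": {"$addToSet": "$category_list"}}},
        # Rename _id back to page_id and keep the collected list.
        {"$project": {"_id": 0, "page_id": "$_id", "category_list": 1}},
    ]
    if sort_by_page:
        # Optional: return the pages in ascending page_id order.
        pipeline.append({"$sort": {"page_id": 1}})
    return pipeline

now = datetime.datetime.now().strftime("%H:%M")
pipeline = build_pipeline(now)

# With a live connection (assumption: db is a pymongo Database object):
# for doc in db.users.aggregate(pipeline):
#     print(doc)
```

The order of the values inside each category_list is not guaranteed with $addToSet, so sort the list on the client side if the order matters.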

Saravana