5

This question based on MongoDB,How to retrieve selected items retrieve by selecting multiple condition.It is like IN condition in Mysql

SELECT * FROM venuelist WHERE venueid IN (venueid1, venueid2)

I have attached json data structure that I have used.[Ref: JSON STRUCTUE OF MONGODB ].

As an example, it has a venueList then inside the venue list, It has several attribute venue id and sum of user agents name and total count as value.user agents mean user Os,browser and device information. In this case I used os distribution.In that case i was count linux,ubuntu count on particular venueid.

it is like that,

"sum" : [
    {
        "name" : "linux",
        "value" : 12
    },
    {
        "name" : "ubuntu",
        "value" : 4
    }
],

Finally I want to get count of all linux user count by selecting venueid list in one find query in MongoDB.

As example, I want to select all count of linux users by conditioning if venue id VID1212 or VID4343

Ref: JSON STRUCTUE OF MONGODB

{
    "_id" : ObjectId("57f940c4932a00aba387b0b0"),
    "tenantID" : 1,
    "date" : "2016-10-09 00:23:56",
    "venueList" : [
        {
            "id" : “VID1212”,
            "sum" : [
                {
                      "name" : "linux",
                      "value" : 12
                },
                {
                    "name" : "ubuntu",
                    "value" : 4
                }
            ],
            “ssidList” : [    // this is list of ssid’s in venue
                {
                    "id" : “SSID1212”,
                    "sum" : [
                        {
                            "name" : "linux",
                            "value" : 8
                        },
                        {
                            "name" : "ubuntu",
                            "value" : 6
                        }
                    ],
                    “macList” : [  // this is mac list inside particular ssid  ex: this is mac list inside the SSID1212
                        {
                            "id" : “12:12:12:12:12:12”,
                            "sum" : [
                                {
                                    "name" : "linux",
                                    "value" : 12
                                },
                                {
                                    "name" : "ubuntu",
                                    "value" : 1
                                }
                            ]
                        }
                    ]
                }
            ]
        },
        {
            "id" : “VID4343”,
            "sum" : [
                {
                     "name" : "linux",
                     "value" : 2
                }
            ],
            "ssidList" : [
                {
                    "id" : “SSID4343”,
                    "sum" : [
                        {
                            "name" : "linux",
                            "value" : 2
                        }
                    ],
                    "macList" : [
                        {
                            "id" : “43:43:43:43:43:34”,
                            "sum" : [
                                {
                                    "name" : "linux",
                                    "value" : 2
                                }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

I am using golang as language to manipulation data with mongoldb using mgo.v2 package

expected out put is :

output

  • linux : 12+2 = 14
  • ubuntu : 4+0 = 4

Don't consider inner list in venuelist.

Sandun Priyanka
  • 591
  • 7
  • 24

1 Answers1

2

You'd need to use the aggregation framework where you would run an aggregation pipeline that first filters the documents in the collection based on the venueList ids using the $match operator.

The second pipeline would entail flattening the venueList and sum subdocument arrays in order for the data in the documents to be processed further down the pipeline as denormalised entries. The $unwind operator is useful here.

A further filter using $match is necessary after unwinding so that only the documents you want to aggregate are allowed into the next pipeline.

The main pipeline would be the $group operator stage which aggregates the filtered documents to create the desired sums using the accumulator operator $sum. For the desired result, you would need to use a tenary operator like $cond to create the independent count fields since that will feed the number of documents to the $sum expression depending on the name value.

Putting this altogether, consider running the following pipeline:

db.collection.aggregate([
    { "$match": { "venueList.id": { "$in": ["VID1212", "VID4343"] } } },
    { "$unwind": "$venueList" },
    { "$match": { "venueList.id": { "$in": ["VID1212", "VID4343"] } } },
    { "$unwind": "$venueList.sum" },    
    {
        "$group": {
            "_id": null,
            "linux": {
                "$sum": {
                    "$cond": [ 
                        { "$eq": [ "$venueList.sum.name", "linux" ] }, 
                        "$venueList.sum.value", 0 
                    ]
                }
            },
            "ubuntu": {
                "$sum": {
                    "$cond": [ 
                        { "$eq": [ "$venueList.sum.name", "ubuntu" ] }, 
                        "$venueList.sum.value", 0 
                    ]
                }
            }
        }
    }
])

For usage with mGo, you can convert the above pipeline using the guidance in http://godoc.org/labix.org/v2/mgo#Collection.Pipe


For a more flexible and better performant alternative which executes much faster than the above, and also takes into consideration unknown values for the sum list, run the alternative pipeline as follows

db.collection.aggregate([
    { "$match": { "venueList.id": { "$in": ["VID1212", "VID4343"] } } },
    { "$unwind": "$venueList" },
    { "$match": { "venueList.id": { "$in": ["VID1212", "VID4343"] } } },
    { "$unwind": "$venueList.sum" },    
    { 
        "$group": {
            "_id": "$venueList.sum.name",
            "count": { "$sum": "$venueList.sum.value" }
        }
    },
    { 
        "$group": {
            "_id": null,
            "counts": {
                "$push": {
                    "name": "$_id",
                    "count": "$count"
                }
            }
        }
    }
])
chridam
  • 100,957
  • 23
  • 236
  • 235
  • Thank you @chridam, I'll go through your answer and give a feedback soon. – Sandun Priyanka Oct 09 '16 at 15:37
  • Can we retrieve each count without mentioning name such as mentioning linux or ubuntu..because we cannot defined it. some time there can be different OS such as windows..can we achieve it. Please give soultion for that. @chridam – Sandun Priyanka Oct 09 '16 at 16:45
  • Yes, but that means your initial expected output structure will change since the alternative approach does not take into consideration the totals as key/value pair. Check updated answer. – chridam Oct 09 '16 at 16:52
  • Thanks @ chridam your support.I am appreciate that. I'll give update you after checking on my side. – Sandun Priyanka Oct 09 '16 at 17:23
  • @chiridam there was an error happen following like this. "Unrecognized pipeline stage name: 'match' – Sandun Priyanka Oct 10 '16 at 05:41
  • @SandunPriyanka That was a typo on my part, now corrected. – chridam Oct 10 '16 at 05:45
  • I also remove double quotation marks between aggregation stages then it is works.Thank you,It is reduce my time. – Sandun Priyanka Oct 10 '16 at 05:54
  • @SandunPriyanka No worries, always happy to help :) – chridam Oct 10 '16 at 05:56
  • @chiridam.. how I get distinct all useragents in the venuelist one query. i used following query but it is not given by aspected result db.useragents.aggregate([ { "$group": { "_id": "$venueList.sum.name" } }, { "$group": { "_id": null, "counts": { "$push": { "name": "$_id", } } } } ]) I want get array all distinct useagents OS list – Sandun Priyanka Oct 11 '16 at 11:27
  • Hard to follow what you are asking. Can you please open a new question detailing what is wrong and what your expected output is? Because it's not recommended here on StackOverflow to update an accepted answer where the [**question changes**](http://meta.stackexchange.com/questions/43478/exit-strategies-for-chameleon-questions). – chridam Oct 11 '16 at 11:32