I created a multi-key compound index via Casbah (Scala library for Mongo):

db.collection.ensureIndex(MongoDBObject("Header.records.n" -> 1) ++ MongoDBObject("Header.records.v" -> 1) ++ MongoDBObject("Header.records.l" -> 1))

Then, in the Mongo shell, I ran db.collection.find(...).explain() and saw that nscannedObjects exceeded db.collection.count(). Looking at the Mongo docs, it appears that ensureIndex needs to be called once, and then any writes will force an update of the index.

However, I saw a post, and this one, stating that db.collection.ensureIndex(...) only needs to be called once.

EDIT

>db.collection.find( {"Header.records" : {$all : [ 
{$elemMatch: {n: "Name", v: "Kevin", 
                         "l" : { "$gt" : 0 , "$lt" : 15}} }]}}, 
             {_id : 1}).explain()
    {
            "cursor" : "BtreeCursor Header.records.n_1_Header.records.v_1_Header.records.l_1",
            "isMultiKey" : true,
            "n" : 4098,
            "nscannedObjects" : 9412,
            "nscanned" : 9412,
            "nscannedObjectsAllPlans" : 9412,
            "nscannedAllPlans" : 9412,
            "scanAndOrder" : false,
            "indexOnly" : false,
            "nYields" : 0,
            "nChunkSkips" : 0,
            "millis" : 152,
            "indexBounds" : {
                    "Header.records.n" : [
                            [
                                    "Name",
                                    "Name"
                            ]
                    ],
                    "Header.records.v" : [
                            [
                                    "Kevin",
                                    "Kevin"
                            ]
                    ],
                    "Header.records.l" : [
                            [
                                    0,
                                    1.7976931348623157e+308
                            ]
                    ]
            },
            "server" : "ABCD:27017"

Note that nscanned (9412) > count (4248).

> db.collection.count()
4248

Why?

Kevin Meredith
  • the update is atomic to the documents own update, it is instant basically – Sammaye Oct 22 '13 at 19:48
  • but why does `db.collection.count()` return **4000**, while running `db.collection.find(...).explain()` shows `nScannedObjects` to be **9000**? – Kevin Meredith Oct 22 '13 at 19:49
  • can you provide the actual explain? It sounds like a badly used index – Sammaye Oct 22 '13 at 19:50
  • `db.claims.find( {"recordss" : {$all : [ {$elemMatch: {n: "Name", v: "Kevin", "l" : { "$gt" : 0 , "$lt" : 15}} }]}}, {_id : 1}).explain()` I'm using a n-v-l database structure similar to this post - http://edgystuff.tumblr.com/post/47178201123/mongodb-indexing-tip-3-too-many-fields-to-index-use – Kevin Meredith Oct 22 '13 at 20:27
  • can you edit your question with the explain results? – Sammaye Oct 22 '13 at 20:48
  • To answer your first question: multikey indexes produce one index entry per array value, so a single document can take up more than one space in the index, which explains `nscanned`. But `nScannedObjects` should relate to documents; I'll need to test this some more – Sammaye Oct 22 '13 at 21:18
  • @Sammaye, looks like the answer might be `About nscanned exceeding the count, that is probable since you actually have way more index entries than you have documents: each item in your list is an index entry.` source - https://jira.mongodb.org/browse/SERVER-10436?focusedCommentId=445006&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-445006. – Kevin Meredith Oct 23 '13 at 00:39
  • Yeah, that's what I said, though I didn't know nscannedobjects was just a counter; that doesn't make it very reliable... – Sammaye Oct 23 '13 at 07:10
  • Why's it not reliable? – Kevin Meredith Oct 23 '13 at 10:30
  • Cos you assume it will dictate how many unique documents it had to look at; if it counts duplicate documents you can't reliably judge when index usage is going haywire – Sammaye Oct 23 '13 at 10:37

1 Answer

About "nscanned" exceeding the count, that is probable since you actually have way more index entries than you have documents: each item in your list is an index entry. It seems like here you have on average 2 items in list per document. "nscannedObjects" follows the same principle since that counter is incremented whenever a document is looked at, even if the same document was already looked at earlier as part of the same query.
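To make the counting concrete, here is a rough sketch in plain JavaScript (no MongoDB required) of a toy multikey index. It is only a model of the idea above, not how MongoDB stores or walks its B-tree. The three sample documents, their values, and the helper `buildIndexEntries` are made up for illustration; the bounds mirror the `indexBounds` in the explain output from the question.

```javascript
// Hypothetical sample data: 3 documents, each with an array of n/v/l sub-documents.
const docs = [
  { _id: 1, Header: { records: [{ n: "Name", v: "Kevin", l: 3 }, { n: "Name", v: "Kevin", l: 20 }] } },
  { _id: 2, Header: { records: [{ n: "Name", v: "Kevin", l: 7 }, { n: "Name", v: "Kevin", l: 30 }] } },
  { _id: 3, Header: { records: [{ n: "Name", v: "Alice", l: 5 }] } },
];

// Toy multikey index: every array element contributes its own (n, v, l) entry
// that points back to the owning document.
function buildIndexEntries(documents) {
  const entries = [];
  for (const doc of documents) {
    for (const r of doc.Header.records) {
      entries.push({ n: r.n, v: r.v, l: r.l, docId: doc._id });
    }
  }
  return entries;
}

const entries = buildIndexEntries(docs);

// Same bounds as the explain output: n == "Name", v == "Kevin", l in [0, max double].
const scanned = entries.filter(
  (e) => e.n === "Name" && e.v === "Kevin" && e.l >= 0 && e.l <= Number.MAX_VALUE
);

// Each index entry inside the bounds is counted once, so a document with
// several entries in range is counted several times.
const nscanned = scanned.length;
const distinctDocsTouched = new Set(scanned.map((e) => e.docId)).size;

console.log("documents in collection:", docs.length);
console.log("index entries in total:", entries.length);
console.log("index entries scanned:", nscanned);
console.log("distinct documents touched:", distinctDocsTouched);
```

In this toy collection, 4 index entries fall inside the bounds even though there are only 3 documents, and only 2 distinct documents are actually touched. If the per-document counter is bumped on every visit, as described above, it ends up at 4 as well.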

agirbal
  • So, if I have 2000 documents. On average, each document has an array of 3 sub-documents. If, of the 2000 documents, 2/3 sub-documents match `n: "Name", v: "Kevin", l: {$gt: 0, $lt: 15}`, then `nScanned` should equal **4000** (2000 documents times 2 sub-documents matching per document)? – Kevin Meredith Oct 23 '13 at 02:10