
We would like to evaluate the effectiveness of our indexes in a MongoDB-based REST service setup. The idea is to populate a collection with a synthetic dataset (e.g. 10,000,000 documents) then run a load injector process doing random REST operations (each one involving a query at MongoDB layer) to evaluate which indexes are being used and statistical information about them (e.g. per index hit rate).
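For context, such a collection can be seeded from the mongo shell with a loop like the following (the collection and field names are made up for illustration; the unordered bulk API used here requires MongoDB 2.6+):

```javascript
// Seed a test collection with synthetic documents (names are illustrative).
// The unordered bulk API (MongoDB 2.6+) batches inserts for speed.
var bulk = db.testdata.initializeUnorderedBulkOp();
for (var i = 0; i < 10000000; i++) {
    bulk.insert({ userId: i, name: "user" + i, score: Math.floor(Math.random() * 100) });
    if (i % 10000 === 0) {       // flush every 10,000 documents
        bulk.execute();
        bulk = db.testdata.initializeUnorderedBulkOp();
    }
}
bulk.execute();                  // flush the remaining documents
```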

We have considered using the explain() command and indexStats. However, explain() has two problems: 1) it only allows evaluating the effectiveness of a single query, and 2) it is difficult to use in a “black box” environment in which our load injector process interacts with the REST service on top of MongoDB but not with MongoDB itself. Regarding indexStats, as far as I understand, it shows information about the index structure “on disk” but not about index usage.
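For a single known query, explain() does report which index is chosen; for example, in the 2.6 shell (the index and query here are illustrative, and the output fields shown are from that version's explain format):

```javascript
// Check which index a specific query would use (MongoDB 2.6 explain format).
db.testdata.ensureIndex({ userId: 1 });      // example index
db.testdata.find({ userId: 42 }).explain();
// "cursor" : "BtreeCursor userId_1" indicates the index was used;
// "BasicCursor" would mean a full collection scan.
```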

So, what is the best way of doing that kind of test? Any procedure description or URL with information about the topic is highly welcome.

fgalan
  • If you know your queries explain() will work for telling you what indexes are used and information about their performance. Are you saying that you don't really know what queries the application is generating therefore you can't run them through explain() to see what indexes are going to "win"? – Bob Kuhar Jun 05 '14 at 06:53
  • @BobKuhar the query depends on the REST API operation being used, and it can be difficult to model the "query pattern" associated with each operation. Thus, I wonder if there is some "black box" approach such as the one I introduce in the question body. However, if no other alternative exists (although Christian's answer about serverStatus() sounds promising), we will end up modeling the query patterns for each operation and using explain(). Thanks! – fgalan Jun 05 '14 at 08:07

1 Answer


You should read about performance profiling.

You can turn on the profiling on with:

db.setProfilingLevel(2);

Or, if you don't want too much noise and only want to capture queries that took longer than e.g. 20 ms:

db.setProfilingLevel(1,20);

You can query the system.profile collection to get information about slow queries, e.g. to find all operations slower than 30 ms:

db.system.profile.find( { millis : { $gt : 30 } } ).pretty()
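To get an aggregate picture instead of individual slow operations, you can also group the profiler entries, e.g. by namespace and operation type (a sketch; the ns, op and millis fields are standard in profile documents, but the grouping keys chosen here are just an example):

```javascript
// Summarize profiled operations per collection and operation type.
db.system.profile.aggregate([
    { $group: {
        _id: { ns: "$ns", op: "$op" },   // group by namespace + operation
        count: { $sum: 1 },              // how many times it ran
        avgMillis: { $avg: "$millis" }   // average duration
    }},
    { $sort: { count: -1 } }
]);
```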

You can then manually profile each slow query with explain().

For real-time monitoring you can use mongotop and mongostat. You should also consider setting up MMS (MongoDB's free monitoring service). In MMS you can check btree hits/misses and compare them with other relevant statistics.
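For example, from a terminal (both tools ship with the MongoDB distribution; the interval arguments below are just sample values):

```shell
# Per-collection read/write time, refreshed every 5 seconds
mongotop 5

# Server-wide counters (inserts, queries, faults, ...) every 2 seconds
mongostat 2
```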

Edit: You can get the relevant indexCounters data by using the serverStatus command:

db.serverStatus()
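The indexCounters section can be read directly in the shell, without involving MMS (note these counters exist on the MMAPv1 engine of MongoDB 2.x and are server-wide, not broken down per index):

```javascript
// Btree access statistics from serverStatus (MongoDB 2.x, MMAPv1 engine).
var ic = db.serverStatus().indexCounters;
printjson(ic);   // { accesses: ..., hits: ..., misses: ..., missRatio: ... }
```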
Christian P
  • Thank you very much for the feedback about profiling, we'll take it into account. However, note that our original purpose was to measure how many times an index is hit or missed rather than to detect slow queries (although both things are related, of course). In that sense, what you say about btree statistics sounds very promising... is it possible to use it without involving MMS (to simplify our test setup without needing a tool that I understand is aimed at production environments)? – fgalan Jun 04 '14 at 14:05