
I want MongoDB to hold query results in RAM for a longer period of time (say 30 minutes, if memory is available). Is that possible? Or is there a way to make sure the data is pre-loaded into RAM before subsequent queries on it?

In fact, I am wondering about MongoDB's performance for simple queries. I have a dedicated server with 10GB of RAM, and my db.stats() output is as follows:

db.stats();
{
    "db": "test",
    "collections":16,
    "objects":625690,
    "avgObjSize":68.90,
    "dataSize":43061996,
    "storageSize":1121402888,
    "numExtents":74,
    "indexes":25,
    "indexSize":28207200,
    "fileSize":469762048,
    "nsSizeMB":16,
    "ok":1
}

Now, when I query a single document (as mentioned here) from a web service, it loads in 1.3 seconds. Subsequent calls of the same query respond in 400ms, and then after a few seconds it goes back to taking 1.3 seconds. It looks like MongoDB has dropped the previously queried document from memory, even though no other queries are asking for data mapped to RAM.

Please explain this, and let me know of any way to make subsequent queries respond faster.

theGeekster
  • http://www.mongodb.org/display/DOCS/Queries+and+Cursors This doc will surely help you. – Nilam Doctor Nov 14 '12 at 07:32
  • This almost certainly has very little to do with MongoDB. Additionally, if you want a query results cache, then cache the results reliably within your application. – Remon van Vliet Nov 14 '12 at 12:19
  • Cache query results within the application? Do you mean that I should use something like MemCached within my web application? If that's what you mean, then what benefit does MongoDB give us, given that it is claimed to be an in-memory database? Also, I have found several use cases on the web of people dropping their MemCached-over-DBMS setup and using only MongoDB. What exactly should one use to get MemCached-equivalent performance from MongoDB? – theGeekster Nov 14 '12 at 16:53

1 Answer


Your observed performance problem on an initial query is likely one of the following issues (in rough order of likelihood):

1) Your application / web service has some overhead to initialize on first request (e.g. allocating memory, setting up connection pools, resolving DNS, ...).

2) Indexes or data you have requested are not yet in memory, so need to be loaded.

3) The Query Optimizer may take a bit longer to run on the first request, as it is comparing the plan execution for your query pattern.

It would be very helpful to test the query via the mongo shell, and isolate whether the overhead is related to MongoDB or your web service (rather than timing both, as you have done).
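As a rough sketch of that isolation step, the query can be timed on its own in the mongo shell. Everything named here is an assumption for illustration: `timeCall` is a hypothetical helper, and the `items` collection and `{ _id: 1 }` filter are placeholders rather than names from the question.

```javascript
// Hypothetical helper: run fn once and report how long it took, in milliseconds.
function timeCall(label, fn) {
  var start = new Date().getTime();
  var result = fn();
  var elapsedMs = new Date().getTime() - start;
  return { label: label, elapsedMs: elapsedMs, result: result };
}

// The global `db` only exists inside the mongo shell; elsewhere this part is skipped.
if (typeof db !== "undefined") {
  // "items" and the filter are placeholders for your own collection and query.
  var timing = timeCall("findOne", function () {
    return db.items.findOne({ _id: 1 });
  });
  print(timing.label + ": " + timing.elapsedMs + " ms");
}
```

If the shell timing is a few milliseconds while the browser sees hundreds, the remaining time is being spent outside MongoDB.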

Following are some notes related to MongoDB.

Caching

MongoDB doesn't have a "caching" time for documents in memory. It uses memory-mapped files for disk I/O and the documents in memory are based on your active queries (documents/indexes you've recently loaded) as well as the available memory. The operating system's virtual memory manager is in charge of caching, and typically will follow a Least-Recently Used (LRU) algorithm to decide which pages to swap out of memory.
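To illustrate the eviction idea only, here is a toy least-recently-used cache. It is not how any operating system implements page replacement, and `LruCache` is a made-up name; the point is that entries leave memory because of capacity pressure and access order, not because a timer expires.

```javascript
// Toy LRU cache: evicts the entry that was touched longest ago once capacity is exceeded.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.pages = new Map(); // Map keeps insertion order, so the first key is the oldest
  }

  get(key) {
    if (!this.pages.has(key)) return undefined; // comparable to a page fault
    const value = this.pages.get(key);
    // Re-insert so this key becomes the most recently used one.
    this.pages.delete(key);
    this.pages.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.pages.has(key)) this.pages.delete(key);
    this.pages.set(key, value);
    if (this.pages.size > this.capacity) {
      // Evict the least recently used entry (first key in insertion order).
      const oldest = this.pages.keys().next().value;
      this.pages.delete(oldest);
    }
  }
}
```

With a capacity of two, reading "a" and then adding a third entry evicts "b", whichever was inserted first. Nothing is evicted while there is spare capacity.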

Memory Usage

The expected behaviour is that over time MongoDB will grow to use all free memory to store your active working data set.

Looking at your provided db.stats() numbers (and assuming that is your only database), it looks like your database size is currently about 1GB, so you should be able to keep everything within your 10GB total RAM unless:

  • there are other processes competing for memory
  • you have restarted your mongod server and those documents/indexes haven't been requested yet
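The size comparison can be checked with a few lines of arithmetic on the db.stats() values from the question. This is only a back-of-the-envelope sketch; treating data plus indexes as the amount worth keeping in memory is a simplifying assumption, and `toMB` is a hypothetical helper.

```javascript
// Values copied from the db.stats() output in the question (bytes).
var stats = { dataSize: 43061996, indexSize: 28207200, storageSize: 1121402888 };

var BYTES_PER_MB = 1024 * 1024;
var ramBytes = 10 * 1024 * BYTES_PER_MB; // 10GB of RAM on the server

// Hypothetical helper: bytes to megabytes, rounded to one decimal place.
function toMB(bytes) {
  return Math.round((bytes / BYTES_PER_MB) * 10) / 10;
}

var dataPlusIndexBytes = stats.dataSize + stats.indexSize;

// Even the allocated storage, which is larger than the live data, is far below RAM.
var fitsInRam = stats.storageSize < ramBytes;

console.log("data + indexes (MB):", toMB(dataPlusIndexBytes));
console.log("allocated storage (MB):", toMB(stats.storageSize));
console.log("fits in RAM:", fitsInRam);
```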

In MongoDB 2.2, there is a new touch command you can use to load indexes or documents into memory after a server restart. This should only be used on initial startup to "warm up" the server, as otherwise you could be unhelpfully forcing actual "active" data out of memory.
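A minimal sketch of a one-off warm-up from the mongo shell, assuming a server version that provides `touch`. The collection name `items` is a placeholder and `buildTouchCommand` is a hypothetical helper that only builds the command document.

```javascript
// Build the command document for `touch`; the command name must be the first key.
function buildTouchCommand(collectionName) {
  return { touch: collectionName, data: true, index: true };
}

// The global `db` only exists inside the mongo shell; elsewhere this part is skipped.
if (typeof db !== "undefined") {
  // "items" is a placeholder; run this once after a restart, not on a schedule.
  printjson(db.runCommand(buildTouchCommand("items")));
}
```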

On a Linux system, for example, you can use the top command and should see that:

  • virtual bytes/VSIZE will tend to be the size of the entire database
  • if the server doesn't have other processes running, resident bytes/RSIZE will be the total memory of the machine (this includes file system cache contents)
  • mongod should not use swap (since the files are memory-mapped)

You can use the mongostat tool to get a quick view of your mongod activity, or, more usefully, use a service like MMS to monitor metrics over time.

Query Optimizer

The MongoDB Query Optimizer compares plan execution for a query pattern every ~1,000 write operations, and then caches the "winning" query plan until the next time the optimizer runs, or until you explicitly call explain() on that query.

This should be a straightforward one to test: run your query in the mongo shell with .explain() and look at the ms timings, and also the number of index entries and documents scanned. The timing for an explain() isn't the actual time the query will take to run, as it includes the cost of comparing the plans. The typical execution will be much faster, and you can look for slow queries in your mongod log.
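A sketch of reading those numbers in the shell follows. The field names (`millis`, `nscanned`, `nscannedObjects`, `n`, `cursor`) are the ones reported by explain() in servers of this era; later versions structure the output differently, so treat them as an assumption. `summarizeExplain` is a hypothetical helper and `items` is a placeholder collection.

```javascript
// Pick out the fields that matter when judging whether an index is being used well.
function summarizeExplain(plan) {
  return {
    cursor: plan.cursor,                   // e.g. an index cursor vs. a full collection scan
    millis: plan.millis,                   // time reported by explain, including plan comparison
    nscanned: plan.nscanned,               // index entries (or documents) examined
    nscannedObjects: plan.nscannedObjects, // documents examined
    n: plan.n                              // documents returned
  };
}

// The global `db` only exists inside the mongo shell; elsewhere this part is skipped.
if (typeof db !== "undefined") {
  printjson(summarizeExplain(db.items.find({ _id: 1 }).explain()));
}
```

A healthy single-document lookup would typically show `nscanned` and `nscannedObjects` close to `n`.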

By default MongoDB will log all queries slower than 100ms, so this provides a good starting point to look for queries to optimize. You can adjust the slow ms value with the --slowms config option, or using the Database Profiler commands.
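As a sketch of the profiler route, the shell helper `db.setProfilingLevel(level, slowms)` can lower the threshold and record slow operations in `system.profile`. The 50ms threshold and the `slowOpsFilter` helper are arbitrary choices for illustration, and profiling adds some overhead of its own.

```javascript
var SLOW_MS = 50; // arbitrary example threshold, lower than the 100ms default

// Hypothetical helper: filter for profiled operations slower than the threshold.
function slowOpsFilter(thresholdMs) {
  return { millis: { $gt: thresholdMs } };
}

// The global `db` only exists inside the mongo shell; elsewhere this part is skipped.
if (typeof db !== "undefined") {
  // Level 1 records only operations slower than SLOW_MS.
  db.setProfilingLevel(1, SLOW_MS);

  // Show the five most recent slow operations recorded by the profiler.
  db.system.profile.find(slowOpsFilter(SLOW_MS)).sort({ ts: -1 }).limit(5).forEach(printjson);
}
```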


Stennie
  • My application/web service definitely has an overhead, and that accounts for hundreds of milliseconds. MongoDB actually takes no more than 5ms to serve my queries from the mongo shell. Should I use something like MemCached to store query results (I would prefer not to)? Or should I run the db.touch command from the web application every 10 minutes to load the previously queried collection? Or is there any other option you can suggest? Can you also tell me how to see which collections/documents are currently in RAM? – theGeekster Nov 14 '12 at 17:47
  • You definitely don't want to abuse the `touch` command; as mentioned, that's only helpful on startup to speed up the natural process by which frequently used data ends up in memory as requested by your application. If the MongoDB query is taking 5ms from the shell, it sounds like your performance issue is entirely within the web service stack and you should be focusing on tuning that. How are you measuring the web server request time .. directly from the web server (via localhost) or from a remote connection? You need to isolate and profile the separate aspects of the request (db,app,net,...). – Stennie Nov 14 '12 at 20:30
  • So you mean that using touch frequently is not what it's intended for. OK. I have set up MongoDB on one server and the asp.net web application on a separate server. I am using FireBug to measure how long it takes from when the user initiates a web service call (using JSON) from the web page, through the web service fetching data directly from the MongoDB server, to the response arriving back at the web page. FireBug reports about 1.3 seconds to complete this cycle. Can you please let me know how I can optimize my web service side to minimize this time, if possible? – theGeekster Nov 15 '12 at 04:04
  • As mentioned, you have to consider the timing for each step separately. From your earlier comment the MongoDB query is only 5ms out of 400-1300ms so you need to look at the average timing for other aspects of a web service request such as: transferring data from the database to application server; application processing the request; application returning the request to your browser; browser rendering the request, etc. One solution may be caching results in your application if the time transferring or processing data is significant, but you need to know timings to target what needs optimizing. – Stennie Nov 15 '12 at 05:13
  • Ok, thank you for your feedback. I shall be considering some caching scheme at web site level, having said that there is no way to retain queried MongoDB data into RAM for some time. – theGeekster Nov 15 '12 at 17:12
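
The application-level caching discussed in these comments could look roughly like the sketch below: results are kept for a fixed time (30 minutes in the question) and reloaded after that. This is an illustration in JavaScript rather than the asker's asp.net stack. `TtlCache`, `getOrLoad` and the synchronous `loader` callback are all hypothetical, and a real driver call would usually be asynchronous.

```javascript
// Minimal time-to-live cache: a value is reused until its expiry time passes.
class TtlCache {
  // `now` is injectable so the clock can be replaced in tests.
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Return the cached value for key, or call loader(key) and cache its result.
  getOrLoad(key, loader) {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) {
      return hit.value; // still fresh, no database round trip
    }
    const value = loader(key);
    this.entries.set(key, { value: value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// Example: keep query results for 30 minutes.
const queryCache = new TtlCache(30 * 60 * 1000);
```

This sketch never removes expired entries unless the same key is requested again, so a production cache would also need a size limit or periodic cleanup.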