8

I have a MongoDB database with 10-12k documents in a collection and I'm experiencing really slow queries when attempting to fetch all documents, like this:

Sales.find()
    .where('author').equals(author)
    .where('date').gt(startDate.unix()).lt(endDate.unix())
    .exec(function(err, results) {
        callback();
    });

This query fetches around 10.5k documents and it takes 1000-1300ms to execute. I tried removing the "where" conditions - it only makes it slower (more documents fetched?).

Does the problem come from Mongoose, MongoDB, JavaScript or Node? I used to run PHP/MySQL database and it was 10-20 times faster in similar conditions, like fetching 10k+ rows of data. What am I doing wrong?

EDIT

Sales schema:

var salesSchema = new Schema({
    author: String,
    kind: String,
    productID: String,
    description: String,
    date: String,
    amount: String,
    transactionID: {
        type: String,
        unique : true
    }
});

Query result from the RoboMongo desktop client:

db.getCollection('sales').find({}).explain()

executionTimeMillis: 46
nReturned: 10359
Nikolay Dyankov
  • 6,491
  • 11
  • 58
  • 79
  • 2
    Can you edit your question to include the schema definition of the `Sales` model, any indexes on the underlying collection, output from the [**`.explain()`**](http://docs.mongodb.org/manual/reference/method/cursor.explain/) query plan for the mongo shell `db.collection.find()` method with the sample query? – chridam Oct 13 '15 at 13:45
  • Yes, I will do that, thanks! – Nikolay Dyankov Oct 13 '15 at 14:01
  • I can see `date` is a String but your mongoose `Sales.find()` method is doing a date range query based on a unix timestamp field which is an integer, also it looks like you don't have any indexes defined on the `date` field so there might be a possibility that mongodb is doing a full collection scan. Can you also add the output from `db.getCollection('sales').find({author: author, date {gt: startDate, lt: endDate }}).explain()`? – chridam Oct 13 '15 at 14:21
  • Something is wrong I think. "db.getCollection('sales').find({author: 'nickys', date: {gt: '1325336400', lt: '1420030800' }}).explain()" doesn't return any results. I tried removing the quotes from the timestamps, but it didn't help. It works if I remove the entire date condition. Forgive me, I'm new to Mongo :) – Nikolay Dyankov Oct 13 '15 at 14:33
  • Are you flexible enough to change the schema design? Was going to suggest that it might help with the query performance if you can convert the date field with the 'string' timestamp to a proper ISODate, then create an index on that field. – chridam Oct 13 '15 at 14:47
  • Yes, I can change the format of the date field. About making it an index - you mean a unique field? That will be a problem, because there will be duplicates once the database gets large enough. – Nikolay Dyankov Oct 13 '15 at 14:58
  • It doesn't necessarily have to be a unique index, you could create it as a [**Single Field Index**](http://docs.mongodb.org/manual/core/index-single/) on the `date` field or as a [**Compound Index**](http://docs.mongodb.org/manual/core/index-compound/) on the `author` and `date` fields since your find query uses both fields. – chridam Oct 13 '15 at 15:19
  • 1
    I will read more about those indexes and update the post when I run the tests again. Thanks! – Nikolay Dyankov Oct 13 '15 at 15:42
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/92185/discussion-between-chridam-and-nikolay-dyankov). – chridam Oct 13 '15 at 19:29
  • doing range queries on a string is always expensive, especially if you don't have an index. You should consider making the date field either `Date` (in which case you'd save date objects) or `Number` (in which case you'd save them as a unix timestamp) and add an index – kio Oct 27 '15 at 23:53
  • I agree @kio, but in my case significant overhead was caused by Mongoose, not MongoDB. Check out my answer below. – Nikolay Dyankov Oct 28 '15 at 12:18

1 Answers1

14

The problem came from Mongoose. By default, find() will return documents as Mongoose Documents, which costs a lot. By adding lean() to the query, documents are returned as plain JavaScript objects and for this case of 10k+ returned documents, the query time got reduced 3-5 times.

Sales.find()
    .where('author').equals(author)
    .where('date').gt(startDate.unix()).lt(endDate.unix())
    .lean()
    .exec(function(err, results) {
        callback();
    });

Read more here: http://www.tothenew.com/blog/high-performance-find-query-using-lean-in-mongoose-2/

Nikolay Dyankov
  • 6,491
  • 11
  • 58
  • 79