The scenario is that:

  • I am querying for data (duh!)

  • Potentially filtering some results out server-side, because expressing the filter exactly in the query is not possible or reasonable (i.e. the query would be too complex)

  • The database may be under considerable load. There may be numerous parallel requests including updates.

So, I could

a) Not use limit() on the query and just keep streaming data until I get enough. However, response time matters, so if the desired data is too sparse, a partial set may have to be returned before a whole page is retrieved.

b) Use limit() but re-query a couple of times when needed in an attempt to retrieve a whole page of data. Again, the final result may still not be a full page. The thinking here is that making a couple of extra requests would put less load on the database.

I understand this is likely an "it depends" question, but I'm wondering if anyone has some insight into best practices or the best starting point to tune from.

Rick Cotter

1 Answer

Option (a) will put additional load on a database that you say is already under considerable load. Without a limit, the nscanned value reported by explain() is likely to be large, which means the MongoDB server is examining a lot of documents per request. This would be a pretty bad idea, in my view.

You can use limit, but be aware that simple skip and limit paging gets more and more expensive as the skip size increases. From the MongoDB documentation:

Unfortunately skip can be (very) costly and requires the server to walk from the beginning of the collection, or index, to get to the offset/skip position before it can start returning the page of data (limit). As the page number increases skip will become slower and more cpu intensive, and possibly IO bound, with larger collections.

Range based paging provides better use of indexes but does not allow you to easily jump to a specific page.

What you are really looking for is range based pagination. As long as you have a unique field you can sort on, you can use $gt (or $lt) on that field to start each page where the previous one ended. See here for another example of how to implement range based paging.
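As a rough sketch of how this could look, the Python below pages on _id with $gt instead of skip, and then layers the "re-query a couple of times" idea from option (b) on top of it. It assumes a PyMongo-style collection object (find, sort and limit behaving as they do in PyMongo) and a unique, indexed _id to sort on. The names fetch_page, fetch_filtered_page, keep and max_round_trips are made up for illustration and are not part of any library.

```python
from typing import Any, Callable, List, Optional, Tuple

ASCENDING = 1  # same value as pymongo.ASCENDING


def fetch_page(collection: Any, page_size: int,
               last_id: Optional[Any] = None) -> List[dict]:
    """Return up to page_size documents whose _id is greater than last_id."""
    # Seek past the previous page with $gt rather than skip(), so the server
    # can start at the index position instead of walking from the beginning.
    query = {} if last_id is None else {"_id": {"$gt": last_id}}
    cursor = collection.find(query).sort("_id", ASCENDING).limit(page_size)
    return list(cursor)


def fetch_filtered_page(collection: Any, keep: Callable[[dict], bool],
                        page_size: int, last_id: Optional[Any] = None,
                        max_round_trips: int = 3) -> Tuple[List[dict], Optional[Any]]:
    """Fill a page with documents that pass keep(), using a bounded number
    of limited queries. Returns (documents, last _id consumed); the page may
    be partial if matches are sparse."""
    page: List[dict] = []
    for _ in range(max_round_trips):
        batch = fetch_page(collection, page_size, last_id)
        for doc in batch:
            last_id = doc["_id"]  # remember where to resume next time
            if keep(doc):
                page.append(doc)
                if len(page) == page_size:
                    return page, last_id
        if len(batch) < page_size:
            break  # nothing left to read in the collection
    return page, last_id
```

With a real connection, collection would be something like client.mydb.mycoll (hypothetical names), and the returned last _id is what the client sends back to request the next page. The cap on round trips is the tuning knob: a higher value gives fuller pages at the cost of response time and extra queries.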

Zaid Masud