
When a query is made to MongoDB, how does its cursor deal with the result set in memory? Does the cursor retrieve all documents that match the query at once, or one document at a time? Are they buffered, or is there a different solution I don't know about?

If it's a buffered solution, how are they stored on the server/client? How much data does the client keep locally?

Jim Rubenstein
  • Memory - which memory, Server or Client? A cursor never retrieves documents, it's a cursor. Buffered - again where? On Client or Server? Or are you only interested for the client as of PHP? – hakre May 05 '11 at 20:41
  • Both, really. I imagine the Server has to store the result somewhere. And, maybe I'm confused as to the job of the cursor - but the cursor does act as my "gateway" to the data, so it has to get it from somewhere, and that somewhere has to store it. I'm just trying to learn/figure out where the data is being stored and how it is accessed/buffered by the PHP client. – Jim Rubenstein May 05 '11 at 20:43
  • A cursor allows the client to navigate through the resultset available on the server. So you query 500 documents, server needs only say, here the cursor. client says thanks and then uses the cursors to navigate to somewhere within that set. Then client says: Give me document at where the cursor points. Server says: fine I know that cursor, so I can give you document. At least that's how I understand a cursor. The cursor itself never fetches but is used in communication between client and server. – hakre May 05 '11 at 20:46
  • Internals are here: [cursor.c](https://github.com/mongodb/mongo-php-driver/blob/master/cursor.c) – hakre May 05 '11 at 20:48

1 Answer


The MongoDB wire protocol has specifications for batch size when issuing a query.

The basic premise is that the client driver issues a query with a numberToReturn field. If the query matches more documents than numberToReturn, only that many are returned to the client.

So the server effectively sends one "batch" to the client. Once the client has cycled through the whole batch, it issues a getmore request and receives the next batch. Meanwhile, the server does not need to load all results into memory, only enough to satisfy the client's current request.

The PHP driver abstracts away much of this complexity. All you do with the driver is request the next item and the driver will handle the getmore where appropriate.
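The batching described above, and the way a driver hides it behind "give me the next item", can be sketched with a small simulation. This is only an illustration in plain Python, not the PHP driver's code: `FakeServer`, `query`, `get_more`, and `iterate` are hypothetical names, and a cursor id of 0 is used here to signal that no further batches exist.

```python
from typing import Dict, Iterator, List, Tuple


class FakeServer:
    """Hypothetical stand-in for a MongoDB server; not a real driver API."""

    def __init__(self, documents: List[dict]) -> None:
        self._documents = documents
        self._cursors: Dict[int, int] = {}  # cursor id -> index of next document
        self._next_id = 1
        self.requests: List[str] = []  # log of requests received, for illustration

    def query(self, number_to_return: int) -> Tuple[int, List[dict]]:
        """Open a cursor and return (cursor id, first batch)."""
        self.requests.append("query")
        cursor_id = self._next_id
        self._next_id += 1
        self._cursors[cursor_id] = 0
        return self._next_batch(cursor_id, number_to_return)

    def get_more(self, cursor_id: int, number_to_return: int) -> Tuple[int, List[dict]]:
        """Return (cursor id, next batch) for an already open cursor."""
        self.requests.append("getmore")
        return self._next_batch(cursor_id, number_to_return)

    def _next_batch(self, cursor_id: int, n: int) -> Tuple[int, List[dict]]:
        start = self._cursors[cursor_id]
        batch = self._documents[start:start + n]
        end = start + len(batch)
        if end >= len(self._documents):
            # Nothing left: drop the server-side state and report cursor id 0.
            del self._cursors[cursor_id]
            return 0, batch
        self._cursors[cursor_id] = end
        return cursor_id, batch


def iterate(server: FakeServer, batch_size: int) -> Iterator[dict]:
    """Yield documents one at a time, fetching a new batch only when needed."""
    cursor_id, batch = server.query(batch_size)
    while True:
        yield from batch  # the client only ever holds one batch locally
        if cursor_id == 0:
            return
        cursor_id, batch = server.get_more(cursor_id, batch_size)


server = FakeServer([{"_id": i} for i in range(5)])
ids = [doc["_id"] for doc in iterate(server, batch_size=2)]
```

With five documents and a batch size of two, the caller sees a plain stream of five documents, while `server.requests` records one query followed by two getmore requests.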

In terms of size, a batch is bounded by whichever limit is reached first: numberToReturn or the maximum BSON size. So if the documents are large, the maximum BSON size may cut the batch short to avoid sending too much data at once.
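The "whichever limit is hit first" rule can be illustrated with a rough sketch. It makes several assumptions: `build_batch` is a hypothetical helper, the JSON-encoded length stands in for the real BSON size, and the byte limits are arbitrary small numbers chosen for the example rather than MongoDB's actual limits.

```python
import json
from typing import List


def build_batch(documents: List[dict], number_to_return: int, max_bytes: int) -> List[dict]:
    """Collect documents until either the count limit or the byte limit is reached."""
    batch: List[dict] = []
    used = 0
    for doc in documents:
        if len(batch) >= number_to_return:
            break  # count limit reached
        # Assumption: JSON length approximates the encoded size; real servers measure BSON.
        size = len(json.dumps(doc).encode("utf-8"))
        # Assumption: always return at least one document, even an oversized one.
        if batch and used + size > max_bytes:
            break  # byte limit reached
        batch.append(doc)
        used += size
    return batch


small = [{"n": i} for i in range(10)]
large = [{"payload": "x" * 100} for _ in range(10)]

count_limited = build_batch(small, number_to_return=3, max_bytes=10_000)
size_limited = build_batch(large, number_to_return=5, max_bytes=250)
```

Here `count_limited` stops at the requested three documents, while `size_limited` stops early because a third large document would exceed the byte budget.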

For more detail, the best place to look is the driver's source code itself.

Gates VP