
I'm using PHP 5.3.6 with PDO to access Postgres 9.0.4. I've been asked to reduce the memory footprint of a report. The current implementation is simple: execute the query, do a fetchAll() and then iterate with foreach() through the resulting array. This obviously doesn't scale with huge result sets: it can temporarily consume 100MB or more.

I have a new implementation which takes the PDO statement handle and then iterates directly on it using foreach(), i.e. no intermediate array via fetchAll(). (From what I've read, iterating a statement handle with foreach calls fetch() under the covers.) This is just as fast and consumes way less memory: about 28kB. Still, I'm not confident I'm doing it right because, although I've done a ton of Googling, it's tough to find answers to basic questions about this:
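To make the pattern concrete, here is a self-contained sketch of that approach. SQLite's in-memory driver stands in for Postgres so the example actually runs anywhere; the iteration pattern is the same with the pgsql PDO driver (table name and row count are just for illustration):

```php
<?php
// SQLite in-memory DB stands in for Postgres to keep the sketch runnable.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Build a small table to iterate over.
$dbh->exec('CREATE TABLE report (id INTEGER PRIMARY KEY, val TEXT)');
$ins = $dbh->prepare('INSERT INTO report (val) VALUES (?)');
for ($i = 0; $i < 1000; $i++) {
    $ins->execute(array("row$i"));
}

$sth = $dbh->prepare('SELECT id, val FROM report');
$sth->execute();

// Iterate the statement handle directly -- no fetchAll(), so only one
// row is held in PHP memory at a time; each foreach step calls fetch()
// under the covers.
$count = 0;
foreach ($sth as $row) {
    $count++;
}
echo $count, "\n";
```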

  • I've seen articles that suggest solving my original problem using cursors. Does the Postgres PDO driver already use cursors internally? If writing my own SQL to create a cursor is required, I'm willing to, but I'd prefer to write the simplest code possible (but no simpler!).

  • If foreach calls fetch() each iteration, isn't that too network chatty? Or is it smart and fetches many rows at once, e.g. 500, to save bandwidth? (This may imply that it uses cursors internally.)

  • I've seen an article that wraps the statement handle in a class that implements the Iterator interface. Isn't this redundant, given that a PDO statement handle already does this? Or am I missing something?

  • My call to prepare the SQL statement looks like this:

    $sth = $dbh->prepare($sql);

I found that it made no memory or speed difference if I did this:

$sth = $dbh->prepare($sql, array( PDO::ATTR_CURSOR => PDO::CURSOR_FWDONLY ) );

Is this because this is the default anyway for the Postgres PDO driver? This would make sense if it is already using cursors internally.

General comments about the approach and other ways to solve this problem are welcome.
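For reference, the manual-cursor alternative I'd fall back to would look roughly like this. This is a hedged sketch, not tested code: it needs a live Postgres connection, and the cursor name, batch size, and query are illustrative. DECLARE/FETCH/CLOSE are standard Postgres SQL, and a cursor must live inside a transaction:

```php
<?php
// Assumes $dbh is an open PDO connection to Postgres.
$dbh->beginTransaction(); // Postgres cursors only exist inside a transaction

// 'report_cur' and the query are placeholders for illustration.
$dbh->exec('DECLARE report_cur NO SCROLL CURSOR FOR SELECT * FROM bigtable');

do {
    // Pull 500 rows per round trip to amortize network chatter.
    $sth  = $dbh->query('FETCH 500 FROM report_cur');
    $rows = $sth->fetchAll(PDO::FETCH_ASSOC);
    foreach ($rows as $row) {
        // process $row here
    }
} while (count($rows) > 0); // empty batch means the cursor is exhausted

$dbh->exec('CLOSE report_cur');
$dbh->commit();
```

Memory stays bounded by the batch size rather than the full result set, at the cost of one extra round trip per batch.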

DaveBurns

2 Answers


PDO for Postgres does use cursors internally.

Jon
  • Thanks, Jon. Indeed it does - I hadn't thought to look straight at the source. My reading of it says that there are no smarts for fetching more than 1 row at a time and buffering them client side to reduce network traffic. Unless that happens deeper down in the Postgres client library that the driver calls, but that seems unlikely. I'll do some perf testing to see how my simple implementation works out with the client and Postgres on different boxes. If it's too slow, I'll roll my own. – DaveBurns Oct 25 '11 at 04:36
  • FWIW, my take on the driver source is that it is not making use of the optional argument to prepare(). Would like to hear if anyone disagrees. – DaveBurns Oct 25 '11 at 04:40

Apparently PDO::CURSOR_FWDONLY does not use cursors. Black box tests:

(0) Preparations:

$con = new \PDO('dsn');
// without a transaction you'll get "NO ACTIVE TRANSACTION" errors
$con->beginTransaction();

$sql = 'select * from largetable';

(1) Default - takes forever:

$stmt = $con->prepare($sql);
$stmt->execute();
print_r($stmt->fetch());

(2) FWDONLY - takes forever:

$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_FWDONLY));
$stmt->execute();
print_r($stmt->fetch());

(3) SCROLLABLE - runs in a flash:

$stmt = $con->prepare($sql, array(\PDO::ATTR_CURSOR => \PDO::CURSOR_SCROLL));
$stmt->execute();
print_r($stmt->fetch());

I turned on PG logging just to be sure and it is indeed so - only SCROLL uses cursors.

So, the only way to make use of cursors is to use SCROLL, at least in PHP 5.4.23.

JohnSmith