How do I manage the total number of connections with the datastax/php-driver?

We're running into an issue with excessive TCP connections, and we suspect it's related to how this driver works.

We've moved from the YACassandra PDO driver to this one. One of the biggest issues I'm discovering is that the connection pool connects to every server in the cluster for each HTTP thread.

We have 4 boxes in our cluster. That's 4 open persistent connections per Apache child. I suspect this is the cause of our troubles.
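
A rough sketch of that arithmetic, assuming exactly one connection per node per Apache child (the real figure depends on driver settings). The function name `estimateConnections` and the child count of 150 are hypothetical, chosen only for illustration:

```php
<?php
// Estimate the total number of TCP connections opened against the cluster.
// Assumption: each Apache child holds $connectionsPerNode connections to
// every node it connects to.
function estimateConnections(int $nodes, int $apacheChildren, int $connectionsPerNode = 1): int
{
    return $nodes * $apacheChildren * $connectionsPerNode;
}

$apacheChildren = 150; // hypothetical value, not taken from the question

echo estimateConnections(4, $apacheChildren), "\n"; // every child connects to all 4 nodes
echo estimateConnections(1, $apacheChildren), "\n"; // every child connects to a single node
```

Under these assumptions the connection count scales with the number of nodes each child connects to, which is why the jump from 1 to 4 connections per child matters at high request volume.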

Historically, we've used YACassandra PDO, which only used 1 connection per thread.

How do we optimize this? Is there any way to stop datastax/php-driver from doing cluster discovery?

guice

1 Answer

The driver automatically discovers the nodes in the cluster and, depending on the load balancing policy, establishes connections to each node when the session (connection) is created. To limit connections to particular hosts you can use the whitelist policy; however, it is not recommended, as it defeats the benefit of routing requests to other nodes when a host is down or unavailable. The whitelist policy has its place, but mainly for testing purposes in my opinion.
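
As a minimal sketch of the whitelist policy, assuming the DataStax PHP driver extension is loaded and a node is reachable at the placeholder address `10.0.0.1` (the address is an assumption, not from the question):

```php
<?php
// Sketch only: requires the DataStax PHP driver extension and a live cluster.
// 10.0.0.1 is a placeholder address.
$cluster = Cassandra::cluster()
    ->withContactPoints('10.0.0.1')  // node used for the initial connection
    ->withWhiteListHosts('10.0.0.1') // hosts not listed here are ignored
    ->build();

$session = $cluster->connect();
```

With a single whitelisted host there is no failover: if that node goes down, requests from this process fail until it returns.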

Another issue that might be occurring is with forking. After a fork, the parent and all child processes share the same underlying sockets, and there isn't a portable way for the php-driver to handle this. Here is an example of how you can handle this in your application code when forking: build the cluster before the fork, but call connect() only afterwards, separately in each process:

$cluster = Cassandra::cluster()->build();

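// fork! (pcntl_fork() requires the pcntl extension)
$pid = pcntl_fork();
if ($pid === -1) {
    exit('Could not fork');
}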

if ($pid) { // parent process
    $session = $cluster->connect();

    // do parent stuff
} else { // child process
    $session = $cluster->connect();

    // do child stuff
}
Fero
  • Everything is running fine, until we get too many TCP connections open because Apache is creating a new httpd thread to handle extra requests. My issue is we actually are white-listing our server nodes: two for tracking, two for reporting, but they are the same cluster. Previously with PDO, threads would open a singular connection using a random/round-robin connection to Cassandra. I don't like the auto-discovery of php-driver – guice Dec 12 '17 at 19:03
  • Understood. It sounds like you want quick connections that die quickly and this could be done with [persistent sessions](http://docs.datastax.com/en/developer/php-driver/latest/api/Cassandra/Cluster/class.Builder/#method-withPersistentSessions) and [schema metadata](http://docs.datastax.com/en/developer/php-driver/latest/api/Cassandra/Cluster/class.Builder/#method-withSchemaMetadata) disabled along with the whitelist policy to force the single node connection. This isn't ideal though as there could be increased IO threads and long connection times. – Fero Dec 12 '17 at 20:25
  • "increased IO threads and long connection times." -- That worries me. For now, I'll disable schema metadata, but keep persistent connections + the whitelist to help minimize the tcp impact. So far, it's looking stable. – guice Dec 12 '17 at 21:19
  • An update: I added the whitelist, but that didn't fix things. Within minutes of adding the server to our cluster, it completely buckled. Everything is fine on YACassandra PDO. Is there a (still supported) PDO Cassandra extension? Our server gets thousands of hits per minute, and the only thing that changed is that we moved to php-driver, still using mod_php. We need a driver that can connect and drop quickly, like PDO. Our code is supposed to be in-and-out in milliseconds, just what Cassandra's designed for. Would this handle better under fpm? – guice Dec 29 '17 at 19:07
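
The configuration discussed in the comments above (persistent sessions kept on, schema metadata disabled, whitelist limited to one node) might look roughly like the following. This is a sketch under assumptions, not a tested recommendation: it needs the DataStax PHP driver extension and a live cluster, and `10.0.0.1` is a placeholder address.

```php
<?php
// Sketch only: requires the DataStax PHP driver extension and a live cluster.
// 10.0.0.1 is a placeholder address.
$cluster = Cassandra::cluster()
    ->withContactPoints('10.0.0.1')
    ->withWhiteListHosts('10.0.0.1') // connect to this node only
    ->withPersistentSessions(true)   // reuse the session across requests in the same process
    ->withSchemaMetadata(false)      // skip loading schema metadata on connect
    ->build();

$session = $cluster->connect();
```

As the last comment reports, this combination did not resolve the connection load in the asker's mod_php setup, so treat it as a starting point for experimentation rather than a fix.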