In GCP's documentation, Google claims to support up to 1million queries per second. My team, however, as part of a project, decided to put both the Regional HTTPS LB and Global HTTPS LB to test.
Here are some of the results we got using 7 clients against 4 n2d-highcpu-64
vms with 2TB - SSD Persistent Disk
for getting random key.
For 30s and 4300 long persisted connections;
- Without the load balancer each instance returned an average of
680k qps
. - Using a Regional HTTPS load balancer with the 4vms as backend service, the results were about
150k qps
. - Using a Global HTTPS load balancer with the same 4vms, and no Cloud Armor, it averaged
205k qps
.
My questions therefore are as follows:
Is there anything within the Load-balancer config responsible for this throttling experienced?
Is there some documentation on the recommended architecture or best practice to achieving at least 1 million qps with the Load Balancer?