When our pods get exhausted and really just need scaling up, Kubernetes mistakes the exhaustion for death and restarts them. This has, of course, the opposite effect: the remaining pods get even more load... So here is my question: can you serve the Kubernetes and LB liveness and readiness endpoints over dedicated, non-exhausted connections?
We have an older system running in Kubernetes, with one Apache httpd and one Tomcat bundled in each pod. Load balancing is done by Kubernetes between pods, not in httpd. httpd runs mpm_event plus mod_jk, with an AJP 1.3 connection to Tomcat; httpd also serves some static resources from disk without involving Tomcat. When something fails, we quickly run out of AJP threads and httpd workers.
Basically what we see is this:
- The application fails to reach some resource: the network, Memcached, the DB, or another service starts to time out. Waiting on those timeouts makes request threads very long-lived, and we run out of them quickly.
- Readiness/liveness probes do not respond in time, and Kubernetes restarts the pod (or, after we removed the liveness probe, the LB that uses readiness removes it from load balancing, which has basically the same effect).
- The root cause gets resolved (somehow), but by then too few (or no) pods are left in load balancing. When a pod comes back, it is hit by all the traffic, gets exhausted, and is removed from the LB again because it responds too slowly to the readiness probe.
- We find it very difficult to get out of this state... (It has happened twice so far, and we basically had to cut off all traffic at the Cloudflare WAF until enough pods were restarted and back in load balancing.)
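For reference, one mitigation on the Kubernetes side is simply giving the readiness probe more headroom before a pod is pulled out of load balancing. A sketch of what that could look like in the pod spec; the path and all timings are illustrative, not our real values:

```yaml
# Illustrative readiness probe with extra headroom; the path and the
# timings are placeholders, not our production values.
readinessProbe:
  httpGet:
    path: /api/v2/health/ready
    port: 80
  periodSeconds: 10
  timeoutSeconds: 5       # how long a single probe attempt may take
  failureThreshold: 6     # pull from LB only after ~1 minute of failures
```

This does not fix the thread exhaustion itself, but it slows the removal cascade down.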
My idea of a solution:
I think I can open a prioritized fast lane from httpd to Tomcat for the liveness and readiness endpoints (see below). But can I somehow dedicate workers in httpd (mpm_event) to these endpoints? Otherwise, once I run out of httpd workers, my fast lane won't help. Or are there other ways to make sure we can always serve liveness/readiness as long as Tomcat is alive, even when it is exhausted?
This is my current httpd worker setup:
<IfModule mpm_event_module>
StartServers 3
ServerLimit 36
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 25
MaxRequestWorkers 900
MaxConnectionsPerChild 0
</IfModule>
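To confirm that the httpd workers really are exhausted (and see how many are stuck waiting on AJP), mod_status can expose the mpm_event scoreboard. An illustrative setup; the access control here is minimal and should be hardened:

```apacheconf
# Illustrative mod_status setup for inspecting the worker scoreboard.
# Restrict access properly before enabling anything like this in production.
LoadModule status_module modules/mod_status.so
ExtendedStatus On
<Location "/server-status">
    SetHandler server-status
    Require ip 127.0.0.1
</Location>
```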
Maybe it takes a worker thread just to parse the request and figure out the URI... :-/ Or can I somehow dedicate a specific pool of workers to liveness and readiness?
My httpd->tomcat fastlane:
I have been experimenting with a second AJP connector on Tomcat, dedicated to the readiness and liveness endpoints. At a glance, it seems to work.
In server.xml, alongside the existing connector on port 8009, I added a prioritized connector on port 8008:
<Connector
port="8009"
protocol="AJP/1.3"
redirectPort="8443"
connectionTimeout="60000"
minSpareThreads="2"
maxThreads="20"
acceptorThreadCount="2"
URIEncoding="UTF-8"
address="127.0.0.1"
secretRequired="false" />
<!--
This is the prioritized connector used for health checks.
-->
<Connector
port="8008"
protocol="AJP/1.3"
redirectPort="8443"
connectionTimeout="-1"
keepAliveTimeout="-1"
acceptorThreadPriority="6"
minSpareThreads="2"
maxThreads="5"
acceptorThreadCount="1"
URIEncoding="UTF-8"
address="127.0.0.1"
secretRequired="false" />
In my workers.properties (the JkWorkersFile) I added the new connection and named it ajp13prio:
worker.list=ajp13,ajp13prio
worker.ajp13.type=ajp13
worker.ajp13.port=8009
worker.ajp13.host=127.0.0.1
worker.ajp13.lbfactor=1
worker.ajp13prio.type=ajp13
worker.ajp13prio.port=8008
worker.ajp13prio.host=127.0.0.1
worker.ajp13prio.lbfactor=1
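I am also considering mod_jk timeouts, so that a stuck Tomcat frees the httpd thread instead of holding it forever. A sketch; the values are untuned guesses, and reply_timeout only helps for requests Tomcat has already accepted:

```properties
# Illustrative mod_jk timeouts (untuned guesses). reply_timeout is in
# milliseconds; socket_timeout and connection_pool_timeout are in seconds.
worker.ajp13.socket_timeout=10
worker.ajp13.reply_timeout=30000
worker.ajp13.retries=1
worker.ajp13.connection_pool_timeout=60
```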
In my httpd conf I mounted the probe endpoints on the new worker:
<VirtualHost *:80>
...
# health checks (readiness and liveness probes) are prioritized
JkMount /api/v2/health/* ajp13prio
# All other requests go to the default worker
JkMount /* ajp13
...
</VirtualHost>
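On the Kubernetes side, the probes would then point at the mounted health paths, so they travel over the prioritized worker. A sketch; the exact sub-paths and port are assumptions:

```yaml
# Sketch: probes hitting the /api/v2/health/* paths that JkMount routes
# over ajp13prio. The sub-paths and port are assumptions, not our real config.
livenessProbe:
  httpGet:
    path: /api/v2/health/live
    port: 80
readinessProbe:
  httpGet:
    path: /api/v2/health/ready
    port: 80
```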