I have a Spring Boot service, using Undertow, whose primary clients are sensors at a client site (~250 such devices). These sensors send POSTs to the service every 10 seconds over the site WiFi, which is somewhat spotty in places.

I am tracking the service in New Relic and see occasional request-response times that are HOURS in length (typical response times are a few dozen millis). There is no processing in the service's controller: all payloads are cached off-thread and forwarded via a separate process (a simplified sketch of the controller is included below).

After about 15 hours the service stops responding and needs to be restarted. I suspect these long-running requests are saturating the pool of threads used to handle requests from other sensors. New Relic suggests that all errors encountered are much like the following:
I/O error while reading input message; nested exception is java.io.IOException:
UT000128: Remote peer closed connection before all data could be read
A high percentage of these errors carry messages pointing to exceptions in the Spring Boot JSON processor, complaining of invalid/unexpected characters or closed input streams.
It seems as if some of the sensors are struggling to complete their POSTs. Is this a fair interpretation?
Is there a way that I can force my service to 'kill' these requests before they eat up all of my handler threads? I'm aware that a client-side circuit-breaker might be the best way to handle this, but I don't have a lot of control over that end of things just yet.
I'm also not wedded to Undertow as a Servlet container - Tomcat or Jetty would be just fine with me, if it makes skinning this cat a bit easier.
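For context, the controller really is just a hand-off. The sketch below is a simplified approximation rather than the actual code (the class names, the /readings path, and the in-memory queue are illustrative placeholders):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SensorController {

    // Stand-in for the real payload shape.
    public static class SensorReading {
        public String sensorId;
        public double value;
    }

    // Payloads land on this queue; a separate forwarder drains it, so the request thread does no real work.
    private final BlockingQueue<SensorReading> payloadQueue = new LinkedBlockingQueue<>();

    @PostMapping("/readings")
    public ResponseEntity<Void> accept(@RequestBody SensorReading reading) {
        payloadQueue.offer(reading);
        return ResponseEntity.accepted().build();
    }
}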
I have the following code in a @Configuration class:
import io.undertow.UndertowOptions;
import org.springframework.boot.web.embedded.undertow.UndertowServletWebServerFactory;
import org.springframework.boot.web.servlet.server.ServletWebServerFactory;
import org.springframework.context.annotation.Bean;

@Bean
public ServletWebServerFactory servletWebServerFactory() {
    // contextPath and serverPort are configured elsewhere in this class
    UndertowServletWebServerFactory factory = new UndertowServletWebServerFactory(contextPath, serverPort);
    factory.addBuilderCustomizers((builder) -> {
        ...
        // close connections that sit idle for more than 60 seconds
        builder.setServerOption(UndertowOptions.IDLE_TIMEOUT, 60000);
        ...
    });
    return factory;
}
But it does not seem to kill off the requests.
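To make the failure mode concrete, this is the kind of stalled request I believe the sensors are producing. It is a rough sketch, not real sensor code (host, port, path, and payload are made up): it opens a connection, sends the headers and part of a JSON body, and then just sits on the connection.

import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class StalledSensorSimulator {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 8080)) {
            OutputStream out = socket.getOutputStream();
            String partialRequest =
                    "POST /readings HTTP/1.1\r\n" +
                    "Host: localhost:8080\r\n" +
                    "Content-Type: application/json\r\n" +
                    "Content-Length: 100\r\n" +     // promises more bytes than are ever sent
                    "\r\n" +
                    "{\"sensorId\":\"abc\",";        // body cut off mid-JSON
            out.write(partialRequest.getBytes(StandardCharsets.UTF_8));
            out.flush();
            // Hold the connection open without ever sending the rest of the body.
            Thread.sleep(10 * 60 * 1000L);
        }   // closing the socket here should produce the UT000128 error on the server
    }
}

If that sketch is representative, the worker thread blocks inside the request-body read, which is exactly where I had hoped IDLE_TIMEOUT would kick in; the hours-long response times suggest it stays blocked far past the 60 seconds I configured.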