6

I am running Spring Cloud Gateway (which I understand to be built on Spring WebFlux) behind an AWS load balancer, and I am receiving intermittent 502 errors. Upon investigation, the issue appears to be connection timeouts between the load balancer and my nodes: the underlying Netty server seems to have a default timeout of 10 seconds. I determined this using the following command...

time nc -vv 10.10.xx.xxx 5100
Connection to 10.10.xx.xxx 5100 port [tcp/*] succeeded!

real    0m10.009s
user    0m0.000s
sys     0m0.000s
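
For reference, the same measurement can be sketched in plain Java: connect, send nothing, and time how long it takes for the peer to close the connection. This is only a rough stand-in for the `time nc` check above. The class and method names (`IdleProbe`, `millisUntilPeerCloses`) and the default host and port are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class IdleProbe {

    /**
     * Connects, sends nothing, and returns the milliseconds until the peer
     * closes the connection, or -1 if it is still open after maxWaitMillis.
     */
    static long millisUntilPeerCloses(String host, int port, int maxWaitMillis) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5_000);
            socket.setSoTimeout(maxWaitMillis); // upper bound on how long read() blocks
            long start = System.nanoTime();
            InputStream in = socket.getInputStream();
            try {
                // read() returns -1 once the peer closes; any bytes received are skipped
                while (in.read() != -1) { }
            } catch (SocketTimeoutException stillOpen) {
                return -1;
            } catch (IOException reset) {
                // a connection reset also means the peer dropped the connection
            }
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        long elapsed = millisUntilPeerCloses(host, port, 120_000);
        System.out.println(elapsed < 0
                ? "still open after 120 s"
                : "closed by peer after " + elapsed + " ms");
    }
}
```

Running it against the node's address and port should report roughly the same duration as the `real` time printed by `time nc`, assuming nothing else on the path closes the connection first.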

While I could just set the idleTimeout on the load balancer to something under 10 seconds, that feels very inefficient; I would like to keep it above 30 seconds if possible. Instead, I would like to increase the connection timeout on the Netty server. I have attempted to set the server.connection-timeout property in my application.yml...

server:
  connection-timeout: 75000

also by specifying seconds...

server:
  connection-timeout: 75s

But this has had no effect on the timeout. When I run the time command to see how long my connection lasts, it still ends at 10 seconds...

time nc -vv 10.10.xx.xxx 5100
Connection to 10.10.xx.xxx 5100 port [tcp/*] succeeded!

real    0m10.009s
user    0m0.000s
sys     0m0.000s

What am I missing here?

brunch

2 Answers

9

The server.connection-timeout configuration key is not supported for Netty servers (yet); I've raised spring-boot#15368 to fix that.

The connection timeout is the maximum amount of time to wait for a connection to be established. If you're looking to customize the read/write timeouts, those are different options. You can add a ReadTimeoutHandler that closes the connection if the server doesn't receive data from the client within the configured duration. The same goes for a WriteTimeoutHandler, but this time for the server writing data to the client.

Here's a complete example for that:

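import io.netty.channel.ChannelOption;
import io.netty.handler.timeout.WriteTimeoutHandler;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ServerConfig {

    @Bean
    public WebServerFactoryCustomizer<NettyReactiveWebServerFactory> serverFactoryCustomizer() {
        return new NettyTimeoutCustomizer();
    }

    static class NettyTimeoutCustomizer implements WebServerFactoryCustomizer<NettyReactiveWebServerFactory> {

        @Override
        public void customize(NettyReactiveWebServerFactory factory) {
            int connectionTimeout = /* ... */; // ChannelOption.CONNECT_TIMEOUT_MILLIS expects milliseconds
            int writeTimeout = /* ... */;      // WriteTimeoutHandler(int) expects seconds
            factory.addServerCustomizers(server -> server.tcpConfiguration(tcp ->
                    // connect timeout: time allowed for establishing the connection, not an idle timeout
                    tcp.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectionTimeout)
                            // write timeout: close the connection if a write does not complete in time
                            .doOnConnection(connection ->
                                    connection.addHandlerLast(new WriteTimeoutHandler(writeTimeout)))));
        }
    }

}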

Back to your question now, I've tested that configuration with the following controller:

@RestController
public class TestController {

    @GetMapping(path = "/", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> textStream() {
        return Flux.interval(Duration.ofSeconds(5)).map(String::valueOf);
    }
}

As long as the interval is shorter than the configured write timeout, the connection is not closed by the server. You can verify that with httpie and the following command: http localhost:8080/ --stream --timeout 60

I've tested this netcat command on my local machine and I'm hitting no timeout so far.

time nc -vv 192.168.0.28 8080
192.168.0.28 8080 (http-alt) open
^CExiting.
Total received bytes: 0
Total sent bytes: 0
nc -vv 192.168.0.28 8080  0.01s user 0.00s system 0% cpu 2:36.53 total

Maybe this is something configured at the OS level, or maybe a network appliance is configured to close such connections? I just saw that you added the spring-cloud-gateway label - maybe this is something specific to that project?

Brian Clozel
  • I have implemented this, but I still get the same result. Is the CONNECT_TIMEOUT property the one I should be using? Essentially I need the equivalent of Apache KeepAlive based on everything I have read for this scenario. Is there a KeepAlive property I can set in Netty? – brunch Dec 04 '18 at 04:53
  • that's a different question then - I've edited my answer – Brian Clozel Dec 04 '18 at 08:13
  • Thanks for all of your help on this, Brian. Unfortunately I am still running into the same issue. After I have applied your recommended approach, the server still terminates connections after 10 seconds. I am testing this using the "time nc" command mentioned above. – brunch Dec 04 '18 at 17:26
  • I've edited my answer (the code snippet was wrong). But I can't reproduce your issue. See the last section of my edited answer. – Brian Clozel Dec 04 '18 at 18:06
  • I need the ability to keep a connection open to the port on the server where netty is running without making a call. Tomcat does this out of the box with a default of 60 seconds. That is basically what I am looking for – brunch Dec 05 '18 at 14:40
  • Are you using Spring Cloud Gateway? Can you reproduce the same problem without Spring Cloud Gateway? Can you reproduce the same problem with Spring Cloud Gateway and Tomcat? – Brian Clozel Dec 05 '18 at 14:41
  • Yes, this happens with Spring Cloud Gateway. Since that is what we were running when we ran into this issue, I thought I would try a basic webflux app to see if it was Gateway causing the issue. But I get the same results using the basic webflux app. – brunch Dec 05 '18 at 14:43
  • As mentioned in my answer, I'm not getting the same results locally with my sample app. the `nc` command never times out. – Brian Clozel Dec 05 '18 at 14:47
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/184761/discussion-between-ben-runchey-and-brian-clozel). – brunch Dec 05 '18 at 15:00
  • Brian, sorry for the delay in getting back to this. We implemented a workaround to get us by for a period of time. I have an example showing my issue. I would like to open an issue for it, but am uncertain where to create it. Should I put it in the Spring project, spring-boot? – brunch Jan 17 '19 at 16:16
1

The Spring documentation at https://docs.spring.io/spring-boot/docs/current/reference/html/common-application-properties.html currently defines server.connection-timeout as "Time that connectors wait for another HTTP request before closing the connection."

This is not what that property currently does for Netty. Right now, the property controls the TCP connection handshake timeout, which is something completely different.
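
To make the distinction concrete, here is a minimal sketch using plain java.net sockets rather than Netty. It shows only the general difference between a connect (handshake) timeout and a timeout on an already-established but silent connection. The class and method names (`TimeoutKinds`, `connectsThenIdles`) are hypothetical, and the silent local server is just a test fixture.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class TimeoutKinds {

    /** Returns true if the connection is established and then stays silent past readTimeoutMillis. */
    static boolean connectsThenIdles(int port, int connectTimeoutMillis, int readTimeoutMillis)
            throws IOException {
        try (Socket socket = new Socket()) {
            // Connect timeout: bounds only the TCP handshake.
            socket.connect(new InetSocketAddress("127.0.0.1", port), connectTimeoutMillis);
            // Read timeout: bounds how long an established connection may stay silent.
            socket.setSoTimeout(readTimeoutMillis);
            try {
                socket.getInputStream().read();
                return false; // the peer sent data or closed before the read timeout
            } catch (SocketTimeoutException idle) {
                return true; // the handshake succeeded, then nothing arrived in time
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A listening socket that never accepts or writes: the OS still completes the handshake.
        try (ServerSocket silentServer = new ServerSocket(0)) {
            boolean idled = connectsThenIdles(silentServer.getLocalPort(), 75_000, 500);
            System.out.println("idle after connect: " + idled);
        }
    }
}
```

A generous connect timeout (75 seconds here) has no bearing on how long the idle connection survives, which is why raising a handshake timeout does not change the 10-second behavior described in the question.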

There is more information about this, and an example of how to actually configure an idle/keep-alive timeout, at https://github.com/spring-projects/spring-boot/issues/18473

Specifically, you can use something like this:

import io.netty.channel.Channel;
import io.netty.channel.ChannelDuplexHandler;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInitializer;
import io.netty.handler.timeout.IdleStateEvent;
import io.netty.handler.timeout.IdleStateHandler;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.embedded.netty.NettyReactiveWebServerFactory;
import org.springframework.boot.web.reactive.server.ReactiveWebServerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

import static java.util.concurrent.TimeUnit.NANOSECONDS;

@Configuration
public class NettyConfig {

    @Bean
    public ReactiveWebServerFactory reactiveWebServerFactory(@Value("${server.netty.idle-timeout}") Duration idleTimeout) {
        final NettyReactiveWebServerFactory factory = new NettyReactiveWebServerFactory();
        factory.addServerCustomizers(server ->
                server.tcpConfiguration(tcp ->
                        tcp.bootstrap(bootstrap -> bootstrap.childHandler(new ChannelInitializer<Channel>() {
                            @Override
                            protected void initChannel(Channel channel) {
                                channel.pipeline().addLast(
                                        new IdleStateHandler(0, 0, idleTimeout.toNanos(), NANOSECONDS),
                                        new ChannelDuplexHandler() {
                                            @Override
                                            public void userEventTriggered(ChannelHandlerContext ctx, Object evt) {
                                                if (evt instanceof IdleStateEvent) {
                                                    ctx.close();
                                                }
                                            }
                                        }
                                );
                            }
                        }))));
        return factory;
    }

}
Bob