I have a setup with Spring Cloud Zuul as the external gateway and Eureka for service discovery, all running in Kubernetes.
I would like to guarantee my service's availability: when one of the instances of my service goes down, I expect Zuul to retry the call against one of the other instances registered in Eureka.
I tried doing this by following Ryan Baxter's post. I also tried to follow the tips from here.
The problem is that whatever I do, Zuul does not seem to retry the call. When I remove one of my instances, requests routed to that instance keep returning a timeout until the Eureka registry is synchronized.
My application.yaml looks like this:
spring:
  cloud:
    loadbalancer:
      retry:
        enabled: true
zuul:
  stripPrefix: true
  ignoredServices: '*'
  routes:
    my-service:
      path: /my-service/**
      serviceId: my-service-api
  retryable: true
my-service:
  ribbon:
    maxAutoRetries: 3
    MaxAutoRetriesNextServer: 3
    OkToRetryOnAllOperations: true
    ReadTimeout: 5000
    ConnectTimeout: 3000
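As a rough sanity check on the Ribbon values above, here is a minimal sketch of how long the configured retries could take in the worst case. It assumes that every attempt uses its full connect and read timeout and that the total number of attempts is (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1). The class and method names (`RetryBudget`, `worstCaseMillis`) are hypothetical and not part of any Spring Cloud or Netflix API.

```java
public class RetryBudget {

    /**
     * Estimates the worst-case total time in milliseconds spent on one request,
     * assuming each attempt exhausts both its connect and read timeout.
     * This is an illustrative calculation, not library behavior.
     */
    static long worstCaseMillis(long connectTimeoutMs, long readTimeoutMs,
                                int maxAutoRetries, int maxAutoRetriesNextServer) {
        // Time one attempt can take before it is abandoned.
        long perAttempt = connectTimeoutMs + readTimeoutMs;
        // Attempts per server (first try plus same-server retries),
        // multiplied by the number of servers tried (first server plus next servers).
        long attempts = (long) (maxAutoRetries + 1) * (maxAutoRetriesNextServer + 1);
        return perAttempt * attempts;
    }

    public static void main(String[] args) {
        // Values taken from the application.yaml above.
        System.out.println(worstCaseMillis(3000, 5000, 3, 3));
    }
}
```

Any timeout that wraps the whole call would have to be longer than this figure for all retries to have a chance to run; whether that is what is happening in my case is not something I have confirmed.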
My service is using Camden SR7 (I also tried SR6):
"org.springframework.cloud:spring-cloud-dependencies:Camden.SR7"
And also Spring-retry:
org.springframework.retry:spring-retry:1.1.5.RELEASE
My application class looks like this:
@SpringBootApplication
@EnableEurekaClient
@EnableZuulProxy
@EnableRetry
public class MyZuulApplication
EDIT:
Making a GET request through Postman returns:
{
  "timestamp": 1497959364819,
  "status": 500,
  "error": "Internal Server Error",
  "exception": "com.netflix.zuul.exception.ZuulException",
  "message": "TIMEOUT"
}
The Zuul logs show the following:
{"level":"WARN","logger_name":"org.springframework.cloud.netflix.zuul.filters.post.SendErrorFilter","appName":...,"message":"Error during filtering","stack_trace":"com.netflix.zuul.exception.ZuulException: Forwarding error [... Stack Trace ...] Caused by: com.netflix.hystrix.exception.HystrixRuntimeException: my-service-api timed-out and no fallback available [... Stack Trace ...] Caused by: java.util.concurrent.TimeoutException: null
Another interesting log that I found:
{"level":"INFO" [...] current list of Servers=[ip_address1:port, ip_address2:port, ip_address3:port],Load balancer stats=Zone stats: {defaultzone=[Zone:[ ... ]; Instance count:3; Active connections count: 0; Circuit breaker tripped count: 0; Active connections per server: 0.0;]
},Server stats: [[Server:ip_address1:port; [ ... ] Total Requests:0; Successive connection failure:0; Total blackout seconds:0; [ ... ]
, [Server:ip_address2:port; [ ... ] Total Requests:0; Successive connection failure:0; Total blackout seconds:0; [ ... ]
, [Server:ip_address3:port; [ ... ] Total Requests:0; Successive connection failure:0; Total blackout seconds:0; [ ... ]