I am using Ribbon without Eureka. I am using a ConfigurationBasedServerList to provide the list of server instances, like so:

customerinfo.ribbon.listOfServers=localhost:9003,localhost:9008

I have configured PingUrl with the /health endpoint. I have also configured AvailabilityFilteringRule, which should filter out the server instances that are not available, like so:

public class RibbonConfig {

@Autowired
IClientConfig ribbonClientConfig;

@Bean
public IPing ribbonPing(IClientConfig config) {
    return new PingUrl(true, "/health");
}

@Bean
public IRule ribbonRule(IClientConfig config) {
    return new AvailabilityFilteringRule();
}

}

This mostly works well. It fails in one case: when the server instance running on port 9008 is down.

Let me explain with some DEBUG messages.

DEBUG com.netflix.loadbalancer.DynamicServerListLoadBalancer - List of Servers for customerinfo obtained from Discovery client: [localhost:9003, localhost:9008]
DEBUG com.netflix.loadbalancer.DynamicServerListLoadBalancer - Filtered List of Servers for customerinfo obtained from Discovery client: [localhost:9003, localhost:9008]
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  clearing server list (SET op)
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  addServer [localhost:9003]
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  addServer [localhost:9008]
com.netflix.loadbalancer.DynamicServerListLoadBalancer - Setting server list for zones: {unknown=[localhost:9003, localhost:9008]}
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  clearing server list (SET op)
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  addServer [localhost:9003]
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  addServer [localhost:9008]
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  forceQuickPing invoked
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  PingTask executing [2] servers configured
DEBUG com.netflix.loadbalancer.BaseLoadBalancer - LoadBalancer:  Server [localhost:9008] status changed to DEAD

Looking at the DEBUG messages, the process being followed looks like this: 1) Clear the server list and add the servers from the config again. 2) Ping them for their status. 3) Update the available server list depending on the ping results.

This process seems to run every 30 seconds to maintain the DynamicServerList.

Now, the problem is this: from the first log statement to the penultimate log statement, Ribbon thinks both server instances are available. So if a load-balancing request arrives within that window, there is a chance that it is sent to the server localhost:9008, which is DOWN.
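
The window described above can be modelled without Ribbon on the classpath. The following is only a minimal sketch, not Ribbon's actual code: the class RefreshWindowSketch and its methods refresh, ping, and selectable are hypothetical names, and it assumes that reloading the configured list marks every server alive until the next ping completes, which is what the DEBUG output suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the refresh cycle; none of these names come from Ribbon.
public class RefreshWindowSketch {
    // Server address -> whether the balancer currently believes it is alive.
    private final Map<String, Boolean> alive = new LinkedHashMap<>();

    // Step 1: clear the list and re-add the configured servers.
    // Assumption: every re-added server is optimistically treated as alive.
    public void refresh(List<String> configured) {
        alive.clear();
        for (String server : configured) {
            alive.put(server, true);
        }
    }

    // Steps 2 and 3: ping each server and record the result.
    public void ping(Set<String> actuallyUp) {
        alive.replaceAll((server, flag) -> actuallyUp.contains(server));
    }

    // Servers a rule that only checks the alive flag could pick right now.
    public List<String> selectable() {
        List<String> result = new ArrayList<>();
        alive.forEach((server, up) -> {
            if (up) {
                result.add(server);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        RefreshWindowSketch lb = new RefreshWindowSketch();
        lb.refresh(Arrays.asList("localhost:9003", "localhost:9008"));
        // Between refresh and ping, the dead server is still selectable.
        System.out.println(lb.selectable()); // [localhost:9003, localhost:9008]
        lb.ping(Collections.singleton("localhost:9003"));
        System.out.println(lb.selectable()); // [localhost:9003]
    }
}
```

Under that assumption, any request routed between refresh and ping can land on localhost:9008, and the same window reopens on every refresh.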

From my understanding, the Ribbon library does not keep PingStatistics. I think the library depends on service discovery tools like Eureka to provide a dynamic server list that contains only the instances that pass some health checks.

To fix this problem, I could start using Eureka and the problem might vanish. I don't want to use Eureka because my environment doesn't grow or shrink often; it's pretty much static.

Is there a config that I am missing here? How do we solve this issue?

I am using "spring-cloud-starter-ribbon" Version 1.2.6.RELEASE.

Amith M
  • Does this problem happen only before the first ping command has completed? – Azarea Oct 18 '17 at 02:33
  • @Sakura Kyouko No, it's not limited to the period before the first ping command has completed. – Amith M Oct 23 '17 at 09:36
  • I think the problem here is that the circuit breaker engages only after a few connection failures. The circuit breaker feature is not provided for the PingUrl component. I think Eureka and other tools remove the server from the list when health checks start failing. This is not available in the Ribbon library when not using service discovery tools. – Amith M Oct 23 '17 at 09:42

1 Answer

None of the available IRule implementations uses reachableServers correctly, so we have to implement a new IRule.

@Slf4j
public class LoadBalanceConfig {

    @Bean
    public IClientConfig ribbonClientConfig() {
        DefaultClientConfigImpl config = new DefaultClientConfigImpl();
        config.set(IClientConfigKey.Keys.IsSecure, false);
        config.set(IClientConfigKey.Keys.ListOfServers, XXXXXXX);
        config.set(IClientConfigKey.Keys.ServerListRefreshInterval, 3000);
        return config;
    }

    @Bean
    public ServerList<Server> ribbonServerList(IClientConfig clientConfig) {
        AbstractServerList<Server> lst = new ConfigurationBasedServerList();
        lst.initWithNiwsConfig(clientConfig);
        return lst;
    }

    @Bean
    public ServerListFilter<Server> ribbonServerListFilter() {
        return new AbstractServerListFilter<Server>() {
            @Override
            public List<Server> getFilteredListOfServers(List<Server> servers) {
                return servers;
            }
        };
    }

    // modified from com.netflix.loadbalancer.RoundRobinRule
    public static class RoundRobinRule implements IRule {
        private ILoadBalancer lb;
        private AtomicInteger nextServerCyclicCounter = new AtomicInteger(0);

        @Override
        public void setLoadBalancer(ILoadBalancer lb) {
            this.lb = lb;
        }

        @Override
        public ILoadBalancer getLoadBalancer() {
            return lb;
        }

        @Override
        public Server choose(Object key) {
            ILoadBalancer lb = getLoadBalancer();
            if (lb == null) {
                log.warn("no load balancer");
                return null;
            }

            List<Server> reachableServers = lb.getReachableServers();
            int upCount = reachableServers.size();
            if (upCount == 0) {
                log.warn("No up servers available from load balancer: " + lb);
                return null;
            }
            int nextServerIndex = incrementAndGetModulo(upCount);
            return reachableServers.get(nextServerIndex);
        }

        private int incrementAndGetModulo(int modulo) {
            for (;;) {
                int current = nextServerCyclicCounter.get();
                int next = (current + 1) % modulo;
                if (nextServerCyclicCounter.compareAndSet(current, next)) {
                    return next;
                }
            }
        }

    }

    @Bean
    public IRule ribbonRule() {
        return new RoundRobinRule();
    }

    @Bean
    public IPing ribbonPing() {
        PingUrl ping = new PingUrl(false, "/XXXactive_detect");
        ping.setExpectedContent("{\"status\":\"OK\"}");
        return ping;
    }

    @Bean
    public ILoadBalancer ribbonLoadBalancer(IClientConfig clientConfig, IRule rule, IPing ping,
            ServerList<Server> serverList, ServerListFilter<Server> filter, ServerListUpdater serverListUpdater) {
        DynamicServerListLoadBalancer<Server> loadBalancer = new DynamicServerListLoadBalancer<>(clientConfig, rule,
                ping, serverList, filter, serverListUpdater);
        return loadBalancer;
    }

    @Bean
    public ServerListUpdater ribbonServerListUpdater(IClientConfig clientConfig) {
        return new PollingServerListUpdater(clientConfig);
    }
}
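
To check the selection logic of the rule above without Ribbon on the classpath, it can be reduced to a plain-Java sketch. The class ReachableRoundRobinSketch and its choose method are hypothetical names, and the list passed in stands in for whatever lb.getReachableServers() returns; this is an illustration under those assumptions, not a replacement for the rule.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, Ribbon-free reduction of the round-robin selection above.
public class ReachableRoundRobinSketch {
    private final AtomicInteger counter = new AtomicInteger(0);

    // Picks the next server from those that passed the last ping; null when none are up.
    public String choose(List<String> reachable) {
        int upCount = reachable.size();
        if (upCount == 0) {
            return null;
        }
        // Atomically advance the counter and wrap it to the current list size.
        int index = counter.updateAndGet(current -> (current + 1) % upCount);
        return reachable.get(index);
    }

    public static void main(String[] args) {
        ReachableRoundRobinSketch rule = new ReachableRoundRobinSketch();
        List<String> both = Arrays.asList("localhost:9003", "localhost:9008");
        System.out.println(rule.choose(both)); // localhost:9008
        System.out.println(rule.choose(both)); // localhost:9003

        // Once the ping marks 9008 as dead, only 9003 is ever returned.
        List<String> onlyUp = Collections.singletonList("localhost:9003");
        System.out.println(rule.choose(onlyUp)); // localhost:9003
        System.out.println(rule.choose(onlyUp)); // localhost:9003
    }
}
```

Because the index is wrapped against the size of the reachable list on every call, a server that failed the last ping is never a candidate, whatever the full configured list contains.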

Roc King