
I am using a client to write to Cassandra (API: com.datastax.driver.core). If I bring down the Cassandra cluster after the connection has been established, I get the following error in my logs:

2015-11-05 12:08:21,667 ERROR [Reconnection-1] com.datastax.driver.core.ControlConnection - [Control connection] Cannot connect to any host, scheduling retry in 1000 milliseconds
.
.
.
2015-11-05 14:15:24,847 DEBUG [Reconnection-0] com.datastax.driver.core.Connection - Connection[/10.75.43.251:9042-24, inFlight=0, closed=false] Error connecting to /10.75.43.251:9042 (Connection refused: /10.75.43.251:9042)
2015-11-05 14:15:24,847 DEBUG [Reconnection-0] com.datastax.driver.core.Connection - Defuncting connection to /10.75.43.251:9042
com.datastax.driver.core.TransportException: [/10.75.43.251:9042] Cannot connect
        at com.datastax.driver.core.Connection.<init>(Connection.java:104)
        at com.datastax.driver.core.Connection$Factory.open(Connection.java:544)
        at com.datastax.driver.core.Cluster$Manager$5.tryReconnect(Cluster.java:1652)
        at com.datastax.driver.core.AbstractReconnectionHandler.run(AbstractReconnectionHandler.java:124)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: /10.75.43.251:9042
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at com.datastax.shaded.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at com.datastax.shaded.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        ... 3 more


2015-11-05 14:15:24,847 DEBUG [New I/O worker #8] com.datastax.driver.core.Connection - Connection[/10.75.43.251:9042-24, inFlight=0, closed=true] closing connection
2015-11-05 14:15:24,847 DEBUG [New I/O boss #9] com.datastax.driver.core.Connection - Connection[/10.75.43.251:9042-24, inFlight=0, closed=false] connection error
java.net.ConnectException: Connection refused: /10.75.43.251:9042
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
        at com.datastax.shaded.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
        at com.datastax.shaded.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
        at com.datastax.shaded.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
        at com.datastax.shaded.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-11-05 14:15:24,849 DEBUG [Reconnection-0] com.datastax.driver.core.Cluster - Failed reconnection to /10.75.43.251:9042 ([/10.75.43.251:9042] Cannot connect), scheduling retry in 600000 milliseconds
2015-11-05 14:15:24,849 DEBUG [Cassandra Java Driver worker-44] com.datastax.driver.core.Cluster - Host /10.75.43.251:9042 is DOWN
2015-11-05 14:15:24,849 DEBUG [Cassandra Java Driver worker-44] com.datastax.driver.core.Cluster - Aborting onDown because a reconnection is running on DOWN host /10.75.43.251:9042

I tried setting the ReconnectionPolicy, which gives me control over the retry delay, but the number of retry attempts (say I want 3) is still not under my control.
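
For reference, this is roughly how the policy is plugged into the Cluster builder (the contact point is taken from the log above; the class name ClusterSetup, the policy choice, and the 1000 ms delay are only illustrative):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.ConstantReconnectionPolicy;

public class ClusterSetup {
    public static Cluster build() {
        return Cluster.builder()
                .addContactPoint("10.75.43.251")   // node from the log above
                // retry every 1000 ms, forever; the number of attempts cannot be capped here
                .withReconnectionPolicy(new ConstantReconnectionPolicy(1000L))
                .build();
    }
}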

I tried ConstantReconnectionPolicy, which only lets me provide the reconnect delay; that worked, but I also want to control the number of retry attempts. I am trying something like:

// Requires: import com.datastax.driver.core.policies.ReconnectionPolicy;
// and its nested interface ReconnectionPolicy.ReconnectionSchedule.

private final int maxReconnectAttempts = 3;          // I want 3 attempts
private final long retryIntervalInMilliSec = 1000L;  // delay between attempts
private volatile int currentRetryCount;

class MyReconnectionPolicy implements ReconnectionPolicy {

    @Override
    public ReconnectionSchedule newSchedule() {
        return new MyReconnectionSchedule();
    }
}

class MyReconnectionSchedule implements ReconnectionSchedule {

    @Override
    public long nextDelayMs() {
        if (++currentRetryCount < maxReconnectAttempts) {
            return retryIntervalInMilliSec;
        } else {
            // Also tried logging here and returning Long.MAX_VALUE instead of throwing.
            throw new Error("Exception Occurred. Retry limits exhausted.");
        }
    }
}

This also doesn't help much: the Error is not propagated to my main program, because the driver invokes nextDelayMs() on its own reconnection threads (the Reconnection-N threads in the log above), not on any thread of mine.
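
As far as I can tell, the place where a driver failure does reach my own code is when I execute a statement while every host is down; a minimal sketch of catching that (the contact point comes from the log above, the keyspace/table in the statement are placeholders):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

public class WriteWithFailureHandling {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.75.43.251")   // node from the log above
                .build();
        try {
            Session session = cluster.connect();
            session.execute("INSERT INTO my_ks.my_table (id) VALUES (1)"); // placeholder statement
        } catch (NoHostAvailableException e) {
            // Thrown when no Cassandra host can be reached for the request;
            // this is where the application itself can decide to give up.
            System.err.println("All hosts are down: " + e.getMessage());
        } finally {
            cluster.close();
        }
    }
}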

Is there an API exposed for this, or an open bug about it? (I couldn't find one.)

Thanks!

nandini

2 Answers


Returning Long.MAX_VALUE schedules the next reconnection attempt far in the future, which is essentially the same as canceling reconnections. I would be careful with that, though, because you could end up losing connectivity with all your nodes.
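
A minimal sketch of that approach, assuming driver 2.x where ReconnectionPolicy only declares newSchedule() (the class name, delay, and attempt limit are illustrative):

import com.datastax.driver.core.policies.ReconnectionPolicy;

public class CappedReconnectionPolicy implements ReconnectionPolicy {

    private final long delayMs;
    private final int maxAttempts;

    public CappedReconnectionPolicy(long delayMs, int maxAttempts) {
        this.delayMs = delayMs;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public ReconnectionSchedule newSchedule() {
        return new ReconnectionSchedule() {
            private int attempts;

            @Override
            public long nextDelayMs() {
                if (++attempts > maxAttempts) {
                    // Push the next attempt so far out that it effectively never happens.
                    return Long.MAX_VALUE;
                }
                return delayMs;
            }
        };
    }
}

Note that the driver asks for a fresh schedule per reconnection cycle, so the cap applies per outage rather than over the lifetime of the Cluster.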

Olivier Michallat

I solved it like this:

// attempts is per schedule; the other fields (totalReconnectionCount,
// maxAttempts, baseDelayMs, maxDelayMs) belong to the enclosing policy.
private class CustomExponentialSchedule implements ReconnectionSchedule {

    private int attempts;

    @Override
    public long nextDelayMs() {

        // If totalReconnectionCount is zero, the application is never stopped.
        if (totalReconnectionCount != 0 && attempts == totalReconnectionCount) {
            // Kill the Java process once the attempt budget is exhausted.
            System.exit(1);
        }

        // Once the delay is capped, keep counting attempts but stop doubling.
        if (attempts > maxAttempts) {
            attempts++;
            return maxDelayMs;
        }

        // Exponential backoff: baseDelayMs * 2^attempts, capped at maxDelayMs.
        return Math.min(baseDelayMs * (1L << attempts++), maxDelayMs);
    }
}
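
The fields the schedule reads (totalReconnectionCount, maxAttempts, baseDelayMs, maxDelayMs) live in the enclosing policy class; a sketch of what that enclosing class could look like, with illustrative values:

// import com.datastax.driver.core.policies.ReconnectionPolicy;
public class CustomExponentialReconnectionPolicy implements ReconnectionPolicy {

    private final long baseDelayMs = 1_000L;       // first delay
    private final long maxDelayMs = 60_000L;       // cap on the delay
    private final int maxAttempts = 6;             // stop doubling after this many attempts
    private final int totalReconnectionCount = 10; // 0 means "keep retrying forever"

    @Override
    public ReconnectionSchedule newSchedule() {
        return new CustomExponentialSchedule();
    }

    // CustomExponentialSchedule (shown above) is declared as an inner class here.
}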
nachì