We are using vnodes in an 8-node datacenter. One of the nodes keeps moving its token range, and each time it does, the connected client gets timeout errors.

Here is what we see in OpsCenter events:

4/13/2016, 10:51am  Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 10:46am  Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 10:43am  Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 10:27am  Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 10:15am  Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 10:06am  Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 9:35am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 9:33am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 9:23am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 9:18am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 8:44am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 8:36am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 8:34am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 8:13am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 7:22am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 6:54am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 6:37am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 6:27am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 6:24am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155    
4/13/2016, 6:16am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155    
4/13/2016, 6:15am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155

Any idea what could cause this? We tried running a full repair on the cluster, but this behavior keeps occurring on this particular node.

All nodes have the same configuration.

We are using DSE 4.8.6
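
To put numbers on how often the node flips between the two tokens, the OpsCenter events above can be summarised with a short script. This is only a minimal sketch: it assumes the events were copied as plain text in exactly the format shown, and the names `parse_moves`, `gaps_in_minutes` and `SAMPLE` are hypothetical helpers, not part of OpsCenter or DSE.

```python
import re
from datetime import datetime

# Assumed line format, taken from the pasted events:
#   4/13/2016, 6:15am   Info    Host <ip> moved from '<token>' to '<token>'   <hostname>
EVENT_RE = re.compile(
    r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4}), "
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2})(?P<ampm>[ap]m)\s+Info\s+"
    r"Host (?P<host>\S+) moved from '(?P<old>-?\d+)' to '(?P<new>-?\d+)'"
)


def parse_moves(text):
    """Return (timestamp, host, old_token, new_token) tuples, oldest first."""
    moves = []
    for m in EVENT_RE.finditer(text):
        # Convert the 12-hour clock by hand to avoid locale-dependent parsing.
        hour = int(m.group("hour")) % 12
        if m.group("ampm") == "pm":
            hour += 12
        ts = datetime(
            int(m.group("year")), int(m.group("month")), int(m.group("day")),
            hour, int(m.group("minute")),
        )
        moves.append((ts, m.group("host"), int(m.group("old")), int(m.group("new"))))
    return sorted(moves, key=lambda move: move[0])


def gaps_in_minutes(moves):
    """Minutes elapsed between each pair of consecutive move events."""
    return [
        (later[0] - earlier[0]).total_seconds() / 60
        for earlier, later in zip(moves, moves[1:])
    ]


# Three of the events from the list above, used as sample input.
SAMPLE = """
4/13/2016, 6:24am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155
4/13/2016, 6:16am   Info    Host 172.31.34.155 moved from '8185241953623605265' to '-1108852503760494577'   ip-172-31-34-155
4/13/2016, 6:15am   Info    Host 172.31.34.155 moved from '-1108852503760494577' to '8185241953623605265'   ip-172-31-34-155
"""

if __name__ == "__main__":
    moves = parse_moves(SAMPLE)
    for (ts, host, old, new), gap in zip(moves[1:], gaps_in_minutes(moves)):
        print(f"{ts:%H:%M} {host}: {old} -> {new} ({gap:.0f} min after previous move)")
```

Running the same functions over the full event list also shows whether two consecutive events ever report a move in the same direction, which would suggest that an event in between was not recorded.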

Vincent de Lagabbe
  • Can you check the `system.log` file of the moving node to see if there are any errors or exceptions? – doanduyhai Apr 13 '16 at 09:22
  • Lots of "com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses." in the system.log of this node when this happens. No other errors or warnings. – Vincent de Lagabbe Apr 13 '16 at 10:31
  • Can you post the content of `system.log` somewhere? It seems this moving node is having trouble reading data. – doanduyhai Apr 13 '16 at 10:54
  • It just stopped happening. Still no idea what happened originally. Maybe it was fixed by a repair? Thanks anyway for the heads-up. – Vincent de Lagabbe Apr 19 '16 at 14:13

0 Answers