I previously asked a similar question about older versions of Apache Curator and ZooKeeper ("Apache Curator failing to reset expired session") and was advised to upgrade. I have since moved to more recent versions of the client libraries and am still seeing a very similar issue.
It appears that when a ZooKeeper session expires, there are situations in which the client-side connection is not reset. The Curator client then goes into an exponential back-off in which it repeatedly retries using the expired session, failing each time with the exception "org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /zk/path".
Everything returns to normal once the client service is fully restarted.
Client logging (Apache Curator 4.2.0; Apache ZooKeeper 3.4.8):
2020-08-17 16:25:47.521 [main-SendThread(hostname:2181)] [INFO ] [org.apache.zookeeper.ClientCnxn ] - Client session timed out, have not heard from server in 6037ms for sessionid 0x10003f181c10087, closing socket connection and attempting reconnect
2020-08-17 16:25:50.848 [main-SendThread(hostname:2181)] [INFO ] [org.apache.zookeeper.ClientCnxn ] - Opening socket connection to server hostname:2181. Will not attempt to authenticate using SASL (unknown error)
2020-08-17 16:25:50.848 [main-SendThread(hostname:2181)] [INFO ] [org.apache.zookeeper.ClientCnxn ] - Socket connection established to hostname:2181, initiating session
2020-08-17 16:25:50.851 [main-SendThread(hostname:2181)] [DEBUG] [org.apache.zookeeper.ClientCnxn ] - Session establishment request sent on hostname:2181
2020-08-17 16:25:50.853 [main-SendThread(hostname:2181)] [TRACE] [org.apache.zookeeper.ClientCnxnSocket ] - readConnectResult 37 0x[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,]
2020-08-17 16:25:50.853 [main-SendThread(hostname:2181)] [WARN ] [org.apache.zookeeper.ClientCnxn ] - Unable to reconnect to ZooKeeper service, session 0x10003f181c10087 has expired
2020-08-17 16:25:50.853 [main-SendThread(hostname:2181)] [INFO ] [org.apache.zookeeper.ClientCnxn ] - Unable to reconnect to ZooKeeper service, session 0x10003f181c10087 has expired, closing socket connection
2020-08-17 16:25:50.962 [main-EventThread ] [DEBUG] [org.apache.curator.RetryLoop ] - Retry-able exception received
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /zk/path
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102)
    at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:237)
    at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:226)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:223)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:216)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:175)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:32)
    [...]
    at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:68)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
2020-08-17 16:25:51.054 [main-SendThread(hostname:2181)] [TRACE] [org.apache.zookeeper.ClientCnxnSocketNIO] - Doing client selector close
2020-08-17 16:25:51.055 [main-SendThread(hostname:2181)] [TRACE] [org.apache.zookeeper.ClientCnxnSocketNIO] - Closed client selector
2020-08-17 16:25:51.056 [main-SendThread(hostname:2181)] [TRACE] [org.apache.zookeeper.ClientCnxn ] - SendThread exited loop for session: 0x10003f181c10087
2020-08-17 16:25:53.991 [main-EventThread ] [TRACE] [o.a.curator.utils.DefaultTracerDriver ] - Counter retries-allowed: 1
2020-08-17 16:25:53.991 [main-EventThread ] [DEBUG] [org.apache.curator.RetryLoop ] - Retrying operation
2020-08-17 16:25:53.992 [main-EventThread ] [DEBUG] [org.apache.curator.RetryLoop ] - Retry-able exception received
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /zk/path
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1102)
    at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:237)
    at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:226)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:223)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:216)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:175)
    at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:32)
    [...]
    at org.apache.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:68)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:505)
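For anyone reading the readConnectResult TRACE line in the client log, here is a rough sketch that decodes those 37 bytes. It rests on my understanding of the connect response layout (int protocolVersion, int timeOut, long sessionId, a length-prefixed passwd, then a one-byte read-only flag) and on the dump being comma-separated hex; both are assumptions on my part, and the class and method names are purely illustrative.

```java
import java.nio.ByteBuffer;

class ConnectResultDecode {
    // The 37 bytes from the TRACE line, split by the ASSUMED field layout.
    static final String DUMP =
          "0,0,0,0,"                          // protocolVersion (int)
        + "0,0,0,0,"                          // timeOut (int)
        + "0,0,0,0,0,0,0,0,"                  // sessionId (long)
        + "0,0,0,10,"                         // passwd length (int), hex 10 = 16
        + "0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"  // passwd (16 bytes)
        + "0,";                               // readOnly flag (1 byte)

    // Turns the comma-separated hex dump into raw bytes.
    static byte[] parseHexDump(String dump) {
        String[] parts = dump.split(",");     // trailing empty token is dropped
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i].trim(), 16);
        }
        return out;
    }

    // Returns {protocolVersion, timeOut, sessionId, passwdLength}.
    static long[] decode(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw); // big-endian by default
        int protocolVersion = buf.getInt();
        int timeOut = buf.getInt();
        long sessionId = buf.getLong();
        int passwdLength = buf.getInt();
        return new long[] { protocolVersion, timeOut, sessionId, passwdLength };
    }

    public static void main(String[] args) {
        long[] f = decode(parseHexDump(DUMP));
        System.out.println("protocolVersion=" + f[0]);
        System.out.println("timeOut=" + f[1]);
        System.out.println("sessionId=0x" + Long.toHexString(f[2]));
        System.out.println("passwdLength=" + f[3]);
    }
}
```

If that layout is right, the server answered the renewal attempt with a timeout of 0 and a session id of 0 rather than 0x10003f181c10087, which matches the "session ... has expired" WARN that follows immediately.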
Server logging (ZooKeeper 3.4.13):
2020-08-17 16:25:47,519 - INFO [SessionTracker:ZooKeeperServer@355] - Expiring session 0x10003f181c10087, timeout of 4000ms exceeded
2020-08-17 16:25:47,521 - INFO [ProcessThread(sid:0 cport:2181)::PrepRequestProcessor@487] - Processed session termination for sessionid: 0x10003f181c10087
2020-08-17 16:25:47,522 - INFO [SyncThread:0:NIOServerCnxn@1056] - Closed socket connection for client which had sessionid 0x10003f181c10087
2020-08-17 16:25:50,849 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - Accepted socket connection from
2020-08-17 16:25:50,851 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@942] - Client attempting to renew session 0x10003f181c10087 at
2020-08-17 16:25:50,852 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@687] - Invalid session 0x10003f181c10087 for client, probably expired
2020-08-17 16:25:50,852 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1056] - Closed socket connection for client which had sessionid 0x10003f181c10087
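The server logs nothing that corresponds to the client-side "KeeperErrorCode = Session expired" errors. According to the client log, the SendThread for this session has already exited by the time the second exception is raised, so the retries do not appear to reach the server at all.
I would really appreciate any insight into this issue. Thanks!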
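To put numbers on the sequence, here is a small sketch that computes the gaps between the key timestamps quoted in the two logs. It uses only java.time; the class and variable names are illustrative, and it assumes the client and server clocks are in sync and that all entries fall on the same day.

```java
import java.time.Duration;
import java.time.LocalTime;

class SessionTimeline {
    // Milliseconds from one wall-clock time to a later one on the same day.
    static long gapMs(String from, String to) {
        return Duration.between(LocalTime.parse(from), LocalTime.parse(to)).toMillis();
    }

    public static void main(String[] args) {
        // Timestamps copied from the logs (server log commas written as dots).
        String serverExpired   = "16:25:47.519"; // server: "Expiring session ..., timeout of 4000ms exceeded"
        String clientTimedOut  = "16:25:47.521"; // client: "Client session timed out, have not heard from server in 6037ms"
        String clientReconnect = "16:25:50.848"; // client: "Socket connection established ..., initiating session"
        String firstException  = "16:25:50.962"; // client: first SessionExpiredException
        String secondException = "16:25:53.992"; // client: SessionExpiredException after the retry

        System.out.println("server expiry -> client timeout:   "
                + gapMs(serverExpired, clientTimedOut) + " ms");   // 2 ms
        System.out.println("client timeout -> reconnect:       "
                + gapMs(clientTimedOut, clientReconnect) + " ms"); // 3327 ms
        System.out.println("first exception -> second (retry): "
                + gapMs(firstException, secondException) + " ms"); // 3030 ms
    }
}
```

So the server expired the session (4000ms timeout) about 2 ms before the client noticed it had not heard from the server for 6037ms, the reconnect attempt came roughly 3.3 seconds later, and the retried operation failed again about 3 seconds after the first failure.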