When an exception occurs in our Infinispan cluster (version 9.4.8.Final), the node that has the exception sends this information to other nodes in the cluster. This seems to be by design.
This activity can cause so much traffic that it causes timeout exceptions which, in turn, make the nodes want to communicate their timeout exceptions to the other nodes. In production, our 3 node Infinispan cluster completely saturated a 20 Gb/s link.
For example, in a 2 node QA cluster we observe the following:
Node 1:
ISPN000476: Timed out waiting for responses for request 7861 from node2
Node 2:
ISPN000217: Received exception from node1, see cause for remote stack trace
Further down the stack trace printed on node 2, we see:
Timed out waiting for responses for request 7861 from node2
There are a large number of these. We took a packet capture during this and can see that there are 50 KB packets containing a list of the remote errors along with their entire java stack traces.
When this happens, it is a "perfect storm" of sorts. Every timeout produces an error that gets sent over the network. This increases congestion and timeouts. From there things worsen extremely quickly.
I understand that I need to troubleshoot the timeout issue - looking for GC collection pauses, etc, and perhaps consider increasing the timeout. However, I'd like to know if there is a way to stop this behavior from happening when these events occur. When you think about it, it seems odd for node 1 to timeout talking to node 2 and then send node 2 a copy of the error over the network telling it 'I timed out talking to you'.
Is there any way to avoid the transmission of these remote stack traces? Many thanks in advance for any insight or advice.
EDIT
Example stack trace:
2019-12-06 11:37:01,587 ERROR [org.keycloak.services.error.KeycloakErrorHandler] (default task-26) Uncaught server error: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from ********, see cause for remote stack trace
at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
at org.infinispan.remoting.transport.ValidSingleResponseCollector.withException(ValidSingleResponseCollector.java:37)
at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:21)
at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:52)
at org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:35)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1372)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1275)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1420)
at org.jgroups.JChannel.up(JChannel.java:816)
at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:133)
at org.jgroups.stack.Protocol.up(Protocol.java:340)
at org.jgroups.protocols.FORK.up(FORK.java:141)
at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
at org.jgroups.protocols.Discovery.up(Discovery.java:295)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1249)
at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at org.jboss.as.clustering.jgroups.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:52)
at java.lang.Thread.run(Thread.java:745)
Suppressed: org.infinispan.util.logging.TraceException
at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:41)
at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1918)
at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1433)
at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:685)
at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:240)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:195)
at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
at org.keycloak.cluster.infinispan.InfinispanNotificationsManager.notify(InfinispanNotificationsManager.java:155)
at org.keycloak.cluster.infinispan.InfinispanClusterProvider.notify(InfinispanClusterProvider.java:130)
at org.keycloak.models.cache.infinispan.CacheManager.sendInvalidationEvents(CacheManager.java:206)
at org.keycloak.models.cache.infinispan.UserCacheSession.runInvalidations(UserCacheSession.java:140)
at org.keycloak.models.cache.infinispan.UserCacheSession$1.commit(UserCacheSession.java:152)
at org.keycloak.services.DefaultKeycloakTransactionManager.commit(DefaultKeycloakTransactionManager.java:146)
at org.keycloak.services.resources.admin.UsersResource.createUser(UsersResource.java:125)
at sun.reflect.GeneratedMethodAccessor487.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:139)
at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:510)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:400)
at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$0(ResourceMethodInvoker.java:364)
at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:366)
at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:338)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:137)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:100)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:439)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:229)
at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:135)
at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:138)
at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:215)
at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:227)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:791)
at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:129)
at org.keycloak.services.filters.KeycloakSessionServletFilter.doFilter(KeycloakSessionServletFilter.java:90)
at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at org.wildfly.extension.undertow.deployment.GlobalRequestControllerHandler.handleRequest(GlobalRequestControllerHandler.java:68)
at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
at io.undertow.server.Connectors.executeRootHandler(Connectors.java:364)
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
... 1 more
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7865 from ********
at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more