We have WSO2 API Manager 2.2.0 deployed in Production.
Our architecture: an API Gateway instance is deployed in the DMZ. It forwards all valid API requests through a load balancer to an internal 'all-in-one' WSO2 API Manager instance (if the Prod instance is unavailable, requests are routed to a DR instance). The all-in-one instance uses Oracle databases for API, Registry, User Management, Message Broker and Stats.
An error appears in the logs every hour. Its impact appears to be that the API Manager cannot communicate with endpoints for around 1 minute, after which the problem seems to resolve itself. During that window WSO2 responds to clients with an 'Error in Sender' message, and the API is suspended for 30 seconds.
Can anyone suggest what the cause of this issue may be?
Error that occurs every hour:
TID: [-1] [] [2019-01-30 01:00:12,156] WARN {org.wso2.andes.client.protocol.AMQProtocolHandler} - Timed out while waiting for heartbeat from peer. {org.wso2.andes.client.protocol.AMQProtocolHandler}
TID: [-1] [] [2019-01-30 01:00:12,272] ERROR {org.wso2.andes.transport.network.mina.MinaNetworkHandler} - Exception caught by Mina {org.wso2.andes.transport.network.mina.MinaNetworkHandler}
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:218)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:748)
TID: [-1] [] [2019-01-30 01:00:12,280] ERROR {org.wso2.andes.transport.network.mina.MinaNetworkHandler} - Exception caught by Mina {org.wso2.andes.transport.network.mina.MinaNetworkHandler}
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:218)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:748)
TID: [-1] [] [2019-01-30 01:00:12,519] ERROR {org.wso2.andes.server.protocol.AMQProtocolEngine} - IOException caught in/<ip-address>:49069(user), session closed implictly: java.io.IOException: Connection reset by peer {org.wso2.andes.server.protocol.AMQProtocolEngine}
TID: [-1] [] [2019-01-30 01:00:12,520] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing all consumers on channel [/<ip-address>:49069(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,520] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing consumer '246' on channel [/<ip-address>:49069(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,456] ERROR {org.wso2.andes.server.protocol.AMQProtocolEngine} - IOException caught in/<ip-address>:7076(user), session closed implictly: java.io.IOException: Connection reset by peer {org.wso2.andes.server.protocol.AMQProtocolEngine}
TID: [-1] [] [2019-01-30 01:00:12,522] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing all consumers on channel [/<ip-address>:7076(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,526] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing consumer '44' on channel [/<ip-address>:7076(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,530] ERROR {org.wso2.andes.client.state.AMQStateManager} - No Waiters for error saving as last error:Exception thrown against AMQConnection:
Host: <ip-address>
Port: 5672
Virtual Host: carbon
Client ID: clientid
Active session count: 1: org.wso2.andes.AMQDisconnectedException: Server closed connection and reconnection not permitted. {org.wso2.andes.client.state.AMQStateManager}
TID: [-1] [] [2019-01-30 01:00:12,533] ERROR {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - JMS Connection failed : Exception thrown against AMQConnection:
Host: <ip-address>
Port: 5672
Virtual Host: carbon
Client ID: clientid
Active session count: 1: org.wso2.andes.AMQDisconnectedException: Server closed connection and reconnection not permitted. - shutting down worker tasks {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1] [] [2019-01-30 01:00:12,534] INFO {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - Reconnection attempt : 1 for Siddhi-JMS-Consumer {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1] [] [2019-01-30 01:00:12,536] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing all consumers on channel [/<ip-address>:29986(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,549] INFO {org.wso2.andes.server.AMQChannel} - Unsubscribing consumer '1' on channel [/<ip-address>:29986(user):1] {org.wso2.andes.server.AMQChannel}
TID: [-1] [] [2019-01-30 01:00:12,552] INFO {org.wso2.andes.subscription.SubscriptionEngine} - Local Subscription DELETED [throttleData]ID=386@NODE<server>/<ip-address>/T=1548808789236/D=false/X=true/O=clientid/E=amq.topic/ET=org.wso2.andes.server.exchange.TopicExchange$1@6b424af0/EUD=0/S=false {org.wso2.andes.subscription.SubscriptionEngine}
TID: [-1] [] [2019-01-30 01:00:12,557] INFO {org.wso2.andes.subscription.SubscriptionEngine} - Local Subscription DELETED [throttleData]ID=385@<server>/<ip-address>/T=1548808788762/D=false/X=true/O=clientid/E=amq.topic/ET=org.wso2.andes.server.exchange.TopicExchange$1@6b424af0/EUD=0/S=false {org.wso2.andes.subscription.SubscriptionEngine}
TID: [-1] [] [2019-01-30 01:00:12,557] INFO {org.wso2.andes.kernel.OrphanedMessageHandler} - Purging messages of this node persisted under throttleData {org.wso2.andes.kernel.OrphanedMessageHandler}
TID: [-1] [] [2019-01-30 01:00:12,773] INFO {org.wso2.andes.kernel.MessagingEngine} - Purged messages of destination throttleData {org.wso2.andes.kernel.MessagingEngine}
TID: [-1] [] [2019-01-30 01:00:12,780] INFO {org.wso2.andes.subscription.SubscriptionEngine} - Local Subscription DELETED [throttleData]ID=384@<server>/<ip-address>/T=1548806362449/D=false/X=true/O=clientid/E=amq.topic/ET=org.wso2.andes.server.exchange.TopicExchange$1@6b424af0/EUD=0/S=false {org.wso2.andes.subscription.SubscriptionEngine}
TID: [-1] [] [2019-01-30 01:00:12,782] INFO {org.wso2.andes.kernel.OrphanedMessageHandler} - Purging messages of this node persisted under throttleData {org.wso2.andes.kernel.OrphanedMessageHandler}
TID: [-1] [] [2019-01-30 01:00:12,791] INFO {org.wso2.andes.kernel.MessagingEngine} - Purged messages of destination throttleData {org.wso2.andes.kernel.MessagingEngine}
TID: [-1] [] [2019-01-30 01:00:13,517] INFO {org.wso2.andes.kernel.FlowControlManager} - Channel removed (ID: <ip-address>:49069) {org.wso2.andes.kernel.FlowControlManager}
TID: [-1] [] [2019-01-30 01:00:13,751] INFO {org.wso2.andes.kernel.FlowControlManager} - Channel removed (ID: <ip-address>:29986) {org.wso2.andes.kernel.FlowControlManager}
TID: [-1] [] [2019-01-30 01:00:13,926] INFO {org.wso2.andes.kernel.FlowControlManager} - Channel removed (ID: <ip-address>:7076) {org.wso2.andes.kernel.FlowControlManager}
TID: [-1] [] [2019-01-30 01:00:17,607] WARN {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - Unable to shutdown all polling tasks of Siddhi-JMS-Consumer {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1] [] [2019-01-30 01:00:17,609] INFO {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - Task manager for jms consumer 1000 shutdown {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1] [] [2019-01-30 01:00:17,612] INFO {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - Task manager for Siddhi-JMS-Consumer [re-]initialized {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1234] [] [2019-01-30 01:00:17,760] INFO {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler} - SASL Mechanism selected: PLAIN {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler}
TID: [-1234] [] [2019-01-30 01:00:17,760] INFO {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler} - Locale selected: en_US {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler}
TID: [-1234] [] [2019-01-30 01:00:17,782] INFO {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler} - Connected as: user{org.wso2.andes.server.handler.ConnectionStartOkMethodHandler}
TID: [-1234] [] [2019-01-30 01:00:17,783] INFO {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler} - Framesize set to 65535 {org.wso2.andes.server.handler.ConnectionStartOkMethodHandler}
TID: [-1234] [] [2019-01-30 01:00:17,873] INFO {org.wso2.andes.server.handler.ChannelOpenHandler} - Connecting to: carbon {org.wso2.andes.server.handler.ChannelOpenHandler}
TID: [-1234] [] [2019-01-30 01:00:17,875] INFO {org.wso2.andes.kernel.AndesChannel} - Channel created (ID: <ip-address>:31136) {org.wso2.andes.kernel.AndesChannel}
TID: [-1234] [] [2019-01-30 01:00:17,957] INFO {org.wso2.andes.server.handler.QueueDeclareHandler} - Queue tmp_ip-address_31136_1 bound to default exchange(<<default>>) {org.wso2.andes.server.handler.QueueDeclareHandler}
TID: [-1234] [] [2019-01-30 01:00:17,958] INFO {org.wso2.andes.server.handler.QueueDeclareHandler} - Queue tmp_ip-address_31136_1 declared successfully {org.wso2.andes.server.handler.QueueDeclareHandler}
TID: [-1234] [] [2019-01-30 01:00:18,074] INFO {org.wso2.andes.server.handler.QueueBindHandler} - Binding queue tmp_ip-address_31136_1 to exchange TopicExchange[amq.topic] with routing key throttleData {org.wso2.andes.server.handler.QueueBindHandler}
TID: [-1] [] [2019-01-30 01:00:18,615] INFO {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager} - Reconnection attempt: 1 for Siddhi-JMS-Consumer was successful! {org.wso2.carbon.apimgt.jms.listener.utils.JMSTaskManager}
TID: [-1] [] [2019-01-30 01:00:20,428] INFO {org.wso2.andes.subscription.SubscriptionEngine} - Local subscription ADDED [throttleData]ID=387@<server>/<ip-address>/T=1548810020424/D=false/X=true/O=clientid/E=amq.topic/ET=org.wso2.andes.server.exchange.TopicExchange$1@6b424af0/EUD=0/S=true {org.wso2.andes.subscription.SubscriptionEngine}
TID: [-1] [] [2019-01-30 01:05:28,619] INFO {org.wso2.andes.kernel.AndesRecoveryTask} - Running DB sync task. {org.wso2.andes.kernel.AndesRecoveryTask}
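
To check that the disconnect really recurs on an hourly schedule, the heartbeat warnings can be pulled out of the log and the gaps between them measured. This is only a minimal sketch: the function names are my own, the log path in the usage comment is a placeholder, and the timestamp pattern assumes every line uses the same `[yyyy-MM-dd HH:mm:ss,SSS]` format as the excerpt above.

```python
import re
from datetime import datetime

# Timestamp as it appears in the excerpt above, e.g. [2019-01-30 01:00:12,156]
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\]")

# Text of the WARN line that opens each occurrence of the problem
MARKER = "Timed out while waiting for heartbeat from peer"


def heartbeat_timeouts(lines):
    """Return the datetime of every log line that contains the heartbeat warning."""
    found = []
    for line in lines:
        if MARKER not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            # The three digits after the comma are milliseconds
            found.append(stamp.replace(microsecond=int(match.group(2)) * 1000))
    return found


def gaps_in_minutes(stamps):
    """Return the gap between consecutive timestamps, in minutes (one decimal)."""
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(stamps, stamps[1:])]


# Usage (the path is a placeholder for wherever the carbon log lives):
#   with open("wso2carbon.log") as log:
#       print(gaps_in_minutes(heartbeat_timeouts(log)))
```

If the gaps come out at a steady 60 minutes, that would point towards something scheduled (for example an idle timeout on a firewall or load balancer, or an hourly job) rather than random network trouble.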