21

We have 4 ActiveMQ brokers (each running on a separate server) setup in a network of brokers. There about 60 producers. The producers lookup the ActiveMQ connection factory from Glassfish using JDNI.

The ActiveMQ URI configured in Glassfish is as follows:

failover:(tcp://phxgapm01:61616,tcp://phxgapm02:61616,tcp://phxgapm03:61616,tcp://phxgapm04:61616)?randomize=true&backup=false&maxReconnectAttempts=8

Each producer process does a JNDI lookup of the javax.jms.ConnectionFactory and then creates 1 javax.jms.Connection. As the producer runs it will periodically create a javax.jms.Session and javax.jms.MessageProducer, send some messages to a queue and then close the Session and MessageProducer.

That is all background - now to my question. From some, but not all of the producers, we'll see a stream of log output like the following:

2014-12-30 21:07:06,534 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,538 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,544 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,548 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,552 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,556 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,561 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,565 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,568 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm02:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,572 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,577 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,581 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,586 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm01:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,590 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm03:61616 - [ActiveMQ Task-1]
2014-12-30 21:07:06,594 INFO  FailoverTransport    - Successfully connected to tcp://phxgapm04:61616 - [ActiveMQ Task-1]

For some of the producers we'll see this output every 10 minutes - for others it is less frequent. Even more confusing is that all of these producers use identical code for the JMS messaging - so while producers may vary in the how frequently they create sessions and messageproducers, they all use the same code and all create just 1 connection object.

From reading the documentation, my understanding is that the failover transport would open a connection to 1 of the brokers (selecting randomly in our case). Why do we see this stream of connections (multiple connections to each of the brokers within 60ms)? Using netstat we can see these connections. Is this normal? If not, any suggestions as to what might be causing this?

Vizllx
  • 9,135
  • 1
  • 41
  • 79
sceaj
  • 1,573
  • 3
  • 12
  • 24
  • Is the code using straight JMS or JMSTemplate etc. examples are helpful. Is there a PooledConnectionFactory in use. – Tim Bish Dec 31 '14 at 15:56
  • It is straight JMS - There is no PooledConnectionFactory being used (at least not directly) – sceaj Dec 31 '14 at 16:24
  • If there's an XA Connection Factory then you need to use the PooledConnectionFactory, otherwise it will disconnect / reconnect all the time. You can see the connection count grow in one of the management topics (cant remember which one) – stringy05 Feb 18 '15 at 04:06
  • if it's not idleTimeout, what about firewall? – MarianP Apr 29 '15 at 13:16
  • you should turn on debug logging in amq – MarianP Apr 29 '15 at 14:31
  • There are no firewalls between the AMQ machine and any clients. (all the relevant machines are in the same network segment). – sceaj Apr 29 '15 at 17:36
  • I can increase the logging, but bear in mind, this doesn't appear to cause any problems per se, I'm just trying to understand if this behavior is expected and "normal", and if so, why it behaves this way? Is it due to some configuration. The documentation leads me to believe that only 1 connection would be established. Based on the number of up-votes the question has received, it appears others are experiencing this (and are confused by it) as well. – sceaj Apr 29 '15 at 17:41
  • What is the idleTimeout of the PooledConnectionFactory set to? The default is 30 seconds. You can try increase that *if it takes the application more than 30 seconds to cycle through the 5 connections, you'll start observing connection churn* – Kenneth Clark Jun 09 '15 at 05:32
  • Are you opening and closing the connection each time a message is produced? If yes, why? That could be the reason. Did you tried with Connection pools? – Rajib Deka Jul 01 '15 at 04:10

1 Answers1

1

Without having the activeMQ loglevels raised there is some room to speculation, but:

  • "for others it is less frequent" - Observing different behaviour on different instances in this case is completely natural. If the load is not perfectly distributed between instances, they will behave differently when it comes to messaging. Just imagine one of your nodes picking up 90% of the triggering inputs and doing most of the work alone. That node would be under much higher load and would use its JMS resources totally differently from the rest of the nodes.
  • "my understanding is that the failover transport would open a connection to 1 of the brokers" - That is completely true. Failover will come in play only if the the wrapping connectionFactory requests a new physical connection to be opened. In this case there will be exactly one connection created for that request.
  • "Why do we see this stream of connections" - I am pretty sure that this is due to having a pooling implementation in your project. You can see that there are those connections established to all of your brokers (randomly distributed), indicating multiple requests for new connections at the same time.

By increasing the loglevel in your application you would be able to see exactly which layer initiates this and for what reason. Possible reasons are: "all connections were expired and the pool restores the minIdleConnection count to the minimum"; "some incoming request in your application requires lots of messages to be sent at once, so your pool creates them".

Gergely Bacso
  • 14,243
  • 2
  • 44
  • 64