I use IBM XMS to connect to a third party to send and receive messages.

UPDATE:

  • Client .Net Core 3.1
  • IBM XMS library from NuGet; tried 9.2.4 and 9.1.5 with the same results
  • Same code used to work fine a week ago - so something must have changed in the MQ manager or somewhere in my infrastructure
  • SSL and client certificates

I have been using a receive with a timeout for a while without problems, but since last week it has stopped returning any messages, even when they are on the queue. Once I switched to the receive method without a timeout, I started picking up messages again, but only every 5 minutes.

Looking at the XMS logs, I can see the messages are actually read almost immediately, both with and without the timeout, but XMS seems to wait those 5 minutes before returning the message...

I haven't changed anything on my side, and the third party assures me they haven't either.

My question is: given the code below, which I use to receive messages, is there anything in it that could cause the 5-minute wait? Any ideas on things I can try? I can share the XMS logs too if that helps.

// This is used to set the default properties in the factory before calling the receive method
        private void SetConnectionProperties(IConnectionFactory cf)
        {
            cf.SetStringProperty(XMSC.WMQ_HOST_NAME, _mqConfiguration.Host);
            cf.SetIntProperty(XMSC.WMQ_PORT, _mqConfiguration.Port);
            cf.SetStringProperty(XMSC.WMQ_CHANNEL, _mqConfiguration.Channel);
            cf.SetStringProperty(XMSC.WMQ_QUEUE_MANAGER, _mqConfiguration.QueueManager);
            cf.SetStringProperty(XMSC.WMQ_SSL_CLIENT_CERT_LABEL, _mqConfiguration.CertificateLabel);
            cf.SetStringProperty(XMSC.WMQ_SSL_KEY_REPOSITORY, _mqConfiguration.KeyRepository);
            cf.SetStringProperty(XMSC.WMQ_SSL_CIPHER_SPEC, _mqConfiguration.CipherSuite);

            cf.SetIntProperty(XMSC.WMQ_CONNECTION_MODE, XMSC.WMQ_CM_CLIENT);
            cf.SetIntProperty(XMSC.WMQ_CLIENT_RECONNECT_OPTIONS, XMSC.WMQ_CLIENT_RECONNECT);
            cf.SetIntProperty(XMSC.WMQ_CLIENT_RECONNECT_TIMEOUT, XMSC.WMQ_CLIENT_RECONNECT_TIMEOUT_DEFAULT);
        }
        
        public IEnumerable<IMessage> ReceiveMessage()
        {
            using var connection = _connectionFactory.CreateConnection();
            using var session = connection.CreateSession(false, AcknowledgeMode.AutoAcknowledge);
            using var destination = session.CreateQueue(_mqConfiguration.ReceiveQueue);
            using var consumer = session.CreateConsumer(destination);

            connection.Start();

            var result = new List<IMessage>();
            var keepRunning = true;
            while (keepRunning)
            {
                try
                {
                    var sw = new Stopwatch();
                    sw.Start();

                    var message = _mqConfiguration.ConsumerTimeoutMs == 0 ? consumer.Receive() 
                        : consumer.Receive(_mqConfiguration.ConsumerTimeoutMs);

                    if (message != null)
                    {
                        result.Add(message);
                        _messageLogger.LogInMessage(message);
                        var elapsedMillis = sw.ElapsedMilliseconds;
                        if (_mqConfiguration.ConsumerTimeoutMs == 0)
                        {
                            keepRunning = false;
                        }
                    }
                    else
                    {
                        keepRunning = false;
                    }
                }
                catch (Exception e)
                {
                    // We log the exception
                    keepRunning = false;
                }
            }

            consumer.Close();
            destination.Dispose();
            session.Dispose();
            connection.Close();

            return result;
        }
Juan
  • Which MQ version are you using? There were problems with the heartbeat and XMS, see for example [this thread](https://stackoverflow.com/questions/56937216/ibm-mq-client-disconnect-after-10-minutes-ibm-xms-illegalstateexception) – Daniel Steinmann Jan 13 '22 at 09:19
  • I am using the 9.2.4 library - let me dig out what is the version of the third party server – Juan Jan 13 '22 at 09:26
  • Looking at that answer, it looks like the issue must have been fixed in 9.2.4 - 9.1.4 is the oldest version available in NuGet that should have the fix - going to give it a go – Juan Jan 13 '22 at 09:32
  • Just asking for some clarity. Is this .NET Standard or .NET Core? SSL involved? – Shashi Jan 13 '22 at 09:51
  • I see you are using SSL. – Shashi Jan 13 '22 at 09:59
  • .NET Core and SSL, yes. The key point is that it had been working until recently, so my bet is that the third party updated the MQ manager and now something is not going OK. I am playing around with changing the acknowledge mode. I tried with a timeout and client ack: I get the message immediately, but get the "connection closed" error when trying to ack it. I am going to try transaction mode and see if that makes any difference – Juan Jan 13 '22 at 10:10
  • Maybe it is this [IT34722](https://www.ibm.com/support/pages/apar/IT34722)? 9.2.5 is not out yet, but 9.2.0.4 is; you could try that version to see if it helps. – JoshMc Jan 13 '22 at 10:22
  • Note the problem solved in IT34722 was introduced in 9.1.4, so a downgrade to 9.1.3 would also work. – JoshMc Jan 13 '22 at 10:31
  • Thanks @JoshMc, I will try to grab the 9.1.3 version of the DLLs (not in NuGet :(). I don't think it is that, though, as when requesting without a timeout the messages arrive eventually... and when using a timeout not even the first one works – Juan Jan 13 '22 at 10:38
  • Then it may be [IJ20591: Managed .NET SSL application making MQGET calls unexpectedly receives MQRC_CONNECTION_BROKEN when running in .NET Core](https://www.ibm.com/support/pages/apar/IJ20591), no patch released yet. Impacts messages larger than 15 KB with .NET Core on TLS channels. See also this [thread](https://community.ibm.com/community/user/integration/communities/community-home/digestviewer/viewthread?GroupId=379&MessageKey=98ea5faa-1bb3-4770-819b-08dc30318149). Did your message size increase recently? – JoshMc Jan 13 '22 at 10:58
  • Rockstar @JoshMc! That sounds exactly like my problem. I will try to get the ifix through the third party as they are the IBM clients - it may well have been an increase in message size... 15 KB doesn't sound like too much! – Juan Jan 13 '22 at 11:17
  • Interesting that a possible workaround is to change the default heartbeat - not sure how to do that though. We are not IBM clients, so I am not sure we can get the ifix ourselves by asking their support. – Juan Jan 13 '22 at 11:22
  • See the thread Daniel Steinmann linked too; I describe how HBINT is negotiated in my answer. If you are not using a CCDT, it should negotiate to the HBINT of the queue manager SVRCONN channel. Note the issue is also solved if you switch to Framework instead of Core. – JoshMc Jan 13 '22 at 11:35
  • Thanks @JoshMc I may have to go to framework for now but I would try to avoid it if possible – Juan Jan 13 '22 at 11:55
  • @JoshMc looks like getting the heartbeat down is doing the trick for us until the proper fix is published in Q1. If you write your comments as an answer I would be happy to accept it! – Juan Jan 13 '22 at 15:07
  • Glad it is working. – JoshMc Jan 13 '22 at 19:18

1 Answer

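The symptoms look like a match for [APAR IJ20591: Managed .NET SSL application making MQGET calls unexpectedly receives MQRC_CONNECTION_BROKEN when running in .NET Core](https://www.ibm.com/support/pages/apar/IJ20591). It affects messages larger than 15 KB when the IBM MQ .NET Standard (.NET Core) libraries are used over TLS channels. See also this [thread](https://community.ibm.com/community/user/integration/communities/community-home/digestviewer/viewthread?GroupId=379&MessageKey=98ea5faa-1bb3-4770-819b-08dc30318149). This will be fixed in 9.2.0.5; no CDS release is listed.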

It states:

Setting the heartbeat interval to lower values may reduce the frequency of occurrence.

If your .NET application is not using a CCDT you can lower the heartbeat by having the SVRCONN channel's HBINT lowered and reconnecting your application.
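
As a rough sketch of what that could look like in `runmqsc` on the queue manager side (run by whoever administers it, here the third party): the channel name `APP.SVRCONN` is a placeholder I made up, and `10` is simply the value the asker later reported trying, not a recommended setting.

```
* Hypothetical channel name; substitute the SVRCONN channel the client connects to.
DISPLAY CHANNEL('APP.SVRCONN') HBINT

* Lower the heartbeat interval (in seconds) on the channel definition.
ALTER CHANNEL('APP.SVRCONN') CHLTYPE(SVRCONN) HBINT(10)

* After the client has reconnected, check the value negotiated on the running instance.
DISPLAY CHSTATUS('APP.SVRCONN') HBINT
```

The change only applies to channel instances started after the `ALTER`, which is why the application has to reconnect.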

JoshMc
  • Thanks again Josh. That seems to have been the problem. It is still a mystery why it started to manifest now, as the message size has remained similar, but reducing the heartbeat to 10 seconds has improved our case considerably – Juan Jan 14 '22 at 08:11