6

MQQueue.Get() operation freezes for 5 minutes when getting a message from QueueManager using SSL on .NetCore 3.1. The same code works fine when running on .Net Framework (v4.8) or without SSL on both runtimes.

Details:

  • Client: MQ Managed Client from NuGet (IBMMQDotnetClient/v9.1.4)
  • Server: docker from tag ibmcom/mq:9.1.4.0-r1 with basic SSL config using SSLCIPH(ANY_TLS12) SSLCAUTH(OPTIONAL)
  • Reproduced on: .Net Core 3.1.2 on Windows and .Net Core 3.1.1 on Linux
  • MQ Logs: MqGet-framework48.zip, MqGet-core31.zip

My Findings:

  • The attached MqGet-core31 tells:
    1. The client sends the get message request at 09:15:52.199588
    2. It receives response with message body almost immediately at 09:15:52.311050
    3. Then it calls another MQRcvThread.ReceiveOneTSH() to finish the get operation. It calls MQTCPConnection.Receive() and it freeze for 5 minutes to 09:20:52.412669
  • Possible reason:
    • MQTCPConnection keeps original socket and MQEncryptedSocket instance wrapping the same socket via SslStream
    • The method MQTCPConnection.Receive() polls on the socket and then it calls MQEncryptedSocket.Read()
    • It looks like that data from the socket are read by SslStream in MQEncryptedSocket before the poll() method is called and poll() waits for its 5 minutes timeout. Or any other race condition around?
    • If I put a breakpoint before problematic poll() call and delay it for a moment, then it often passess without any delay!

Test Code:

This code finishes in ~2 seconds on .Net Framework, but it needs 5 minutes and ~2 seconds when it is run on .Net Core. (Win or Linux)

// import server's CA
MQConnectionPool.ImportServerCertificateAuthorityIfNotPresent(CertMqTestCa);

Hashtable properties = new Hashtable {
    {MQC.HOST_NAME_PROPERTY, Mq1QmSslAnyTls12.Host},
    {MQC.PORT_PROPERTY, Mq1QmSslAnyTls12.Port},
    {MQC.CHANNEL_PROPERTY, Mq1QmSslAnyTls12.ChannelName},
    {MQC.USER_ID_PROPERTY, "admin"},
    {MQC.PASSWORD_PROPERTY, "changeit"},
    {MQC.CCSID_PROPERTY, 819},
    {MQC.SSL_CIPHER_SPEC_PROPERTY, "TLS_RSA_WITH_AES_256_CBC_SHA256"},
    {MQC.SSL_CERT_STORE_PROPERTY, "*USER"}
};
MQQueueManager queueManager;
using (new MQCcsidSetter(properties)) {
    queueManager = new MQQueueManager(null, properties);
}

string requestText = Guid.NewGuid().ToString();
string queueName = "SV.MqConnectionPoolTest";
// put message - works fine
using (MQQueue queue = queueManager.AccessQueue(queueName, MQC.MQOO_OUTPUT + MQC.MQOO_FAIL_IF_QUIESCING)) {
    byte[] requestBytes = Encoding.UTF8.GetBytes(requestText);
    MQMessage requestMessage = new MQMessage {Expiry = 3000, Priority = 4, CharacterSet = 1208, MessageType = MQC.MQMT_DATAGRAM};
    requestMessage.Write(requestBytes, 0, requestBytes.GetLength(0));
    queue.Put(requestMessage, new MQPutMessageOptions());
}

// get message back from the same queue
using (MQQueue queue = queueManager.AccessQueue(queueName, MQC.MQOO_INPUT_AS_Q_DEF + MQC.MQOO_FAIL_IF_QUIESCING)) {
    while (true) {
        MQMessage msg = new MQMessage();
        queue.Get(msg, new MQGetMessageOptions()); // <<= !!!IT IS DELAYED HERE!!!
        msg.Seek(0);
        string msgContent = msg.ReadString(msg.MessageLength);
        if (requestText.Equals(msgContent)) {
            break;
        }
    }
}

Thread Dumps:

  • MQ Client Receiver Thread
SocketPal.Poll()
Socket.Poll()
MQTCPConnection.Receive() 
MQRcvThread.ReceiveBuffer()
MQRcvThread.ReceiveOneTSH()
MQRcvThread.Run()
ThreadHelper.ThreadStart_Context()
ExecutionContext.RunInternal()
ThreadHelper.ThreadStart()

Waits on Poll(), but it ends with 5 minute (heartbeat) timeout. Then it goes to this.network.Read() and immediately gets proper data.

    this.socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReceiveTimeout, this.timeout);
    if (this.socket.Poll(this.timeout * 1000, SelectMode.SelectRead))
    {
      length2 = this.network.Read(cBuffer, offset1, length1);
  • Application Thread
Monitor.Wait()
MQSession.ExchangeTSH() 
MQProxyQueue.RequestMessages()
MQProxyQueue.FlushQueue()
MQProxyQueue.ProxyMQGET()
MQFAP.zstMQGET()
MQFAP.MQGET()
MQDestination.Get()
MQDestination.Get()
MqConnectionPoolTest.TestMqDirect()

Waits for entry.Reply to be filled by receive thread:

    while (entry.Reply == null) {
      Monitor.Wait((object) entry, this.rmtReqEntMaxPollTime, true);
      ...

Please, does anybody know a workaround or has a fix?

2 Answers2

1

You should not Get and wait infinitely, you should do it like this:

const int TIMEOUTTIME = 20000;
try
{
    MQGetMessageOptions gmo = new MQGetMessageOptions();
    gmo.Options = MQC.MQGMO_WAIT;
    gmo.WaitInterval = TIMEOUTTIME;
    queue.Get(msg, gmo);
    msg.Seek(0);
    string msgContent = msg.ReadString(msg.MessageLength);
    if (requestText.Equals(msgContent)) {
        break;
    }
}
catch (MQException ex)
{
    if (ex.CompletionCode == MQC.MQCC_FAILED && ex.ReasonCode == MQC.MQRC_NO_MSG_AVAILABLE)
    {
        // Log on DEBUG level something like this:
        //    No message from <queue> after <TIMEOUTTIME / 1000> seconds, continue receiving");
        continue;
    }
    throw ex;
}

To make your programmer's life easier, I would recommend to use the XMS API.

Daniel Steinmann
  • 2,119
  • 2
  • 15
  • 25
  • Yes, it is possible to use a timeout or a separated reader thread in basic cases. But our application is able to handle thousands messages per second. Each message needs to be processed in milliseconds, so waiting for any timeout is not solution. The bug is probably inside MQ Client (or .Net Core?) and the message is already consumed from server's point of view. If I will use the proposed solution (regardless the performance), I will not get the message in repeated get. Anyway, thank you for proposed workaround. – Pavel Chuchma Mar 27 '20 at 14:06
  • The wait is just how long until you get a no message available result. You still receive the message as soon as it hits the queue. – JoshMc Mar 27 '20 at 15:30
  • Not exactly. The issue described above is inside Managed MQ Client on Linux. It immediately consumes the message from the queue, but queue.Get() method is delayed for 5 minutes timeout. – Pavel Chuchma Mar 30 '20 at 09:25
  • The 5 minutes (300 seconds) is the default Heartbeat Interval, see [HBINT](https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_9.1.0/com.ibm.mq.ref.con.doc/q081890_.htm). Could it has to do with this timeout setting? There was already a Heartbeat related [issue](https://stackoverflow.com/a/50678569/638413), maybe you can check the thread dump and see where you are stuck. – Daniel Steinmann Mar 30 '20 at 14:41
  • Yes, `Poll()` successfully ends with heartbeat timeout. Heartbeat timeout is used in the same method before the `Poll()` call. I added thread dumps to the original post. If I put breakpoint on line with `socket.Poll()`, and I press continue on each break, then test passes without freeze. – Pavel Chuchma Mar 30 '20 at 17:17
  • 1
    @PavelChuchma I think the issue you face is a defect, if you have a IBM support contract or if you can get the MQ server team you connect to to open a case with IBM they can confirm this with the information you have provided. 9.1.5 is the next release that will come out from IBM, if this is a known issue it should be resolved in that release. As for the issue you are having, you may be able to work around it by asking the MQ Admin to lower the HBINT to something less than 300 seconds, it is explained in much detail in the link to my prior answer provided by Daniel, but something like 15. – JoshMc Mar 30 '20 at 18:44
  • @PavelChuchma It appears that the get with wait is a work around to the issue. As far as reporting it, as I mentioned above: If you have a IBM support contract or if you can get the MQ server team you connect to to open a case with IBM they can confirm this with the information you have provided. – JoshMc Apr 02 '20 at 16:39
0

Just an update. I opened a case with IBM in feb 2020. Response: "If the .NET client is connecting to a back level QMgr (other than 9.1.4) then we were not allocating memory for conntag that causes System.ArgumentNullException" They fixed it in Managed IBM MQ .Net Client v9.1.6 I believe. So either make sure the MQ server is more recent or use the more up to date client.