
Server 1 sends an XML message via IIS to Server 2. Server 2 receives it and sends an acknowledgment message back to Server 1. Upon receipt of that acknowledgment, Server 1 sends the next message in the queue.

However, Server 1 intermittently (4-5 times a week) does not receive the acknowledgment message (we tested the issue and proved that Server 2 is sending the acknowledgment message).

The IIS logs for the times when this occurs show error 1236 (sc-win32-status 1236, which means "The network connection was aborted by the local system").

We're at a loss as to what is causing this or how to fix it. Interested to see if anyone has come across an issue like this before...

Paul S
  • Don't settle on your guess about 1236 yet, as that code can mean quite a lot. "Server 1 is sending an xml message via IIS to Server 2" is not a useful description at all. If you cannot share some code, IIS log file fragments, or other useful information, this question won't be answerable. – Lex Li Oct 23 '18 at 00:36

1 Answer


How did you prove that Server 2 is sending the acknowledgment message -- through network tracing on Server 1, or by some other means? Logs within the software may not be enough. Barring a problem at the networking level, it is possible that one side is hitting an exception and aborting the connection as a result. The application pools may also be auto-recycling under the IIS recycling rules; although IIS should handle a pool restart properly, something may not have gone as expected. While the new worker process starts and the old one finishes its final requests during shutdown, there may be some locking in code that does not expect two processes to run at the same time.

Brian Clink
  • Yes, confirmed via Fiddler, a tracing tool. Hundreds of these XML messages go from Server 1 to Server 2 per minute. We never had this issue before. It started occurring "randomly" in the last 3 months, but it is now happening every day or two. Because Server 1 does not receive the acknowledgment, it does not send the remaining messages, so we're left with a massive backlog every time this happens. – Paul S Oct 23 '18 at 00:26
  • Is there any pattern -- a certain number of requests, a certain request/response? Anything different about that particular response? Are you certain the code on Server 1 did not start processing the acknowledgment and then fail silently? I would look at two levels -- the application level and the networking level. At the network layer, see whether there is a TCP reset and which side (the sender or the receiver) is sending it. At the application level, find a way to confirm whether the software did start processing the data. – Brian Clink Oct 24 '18 at 17:57
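
One way to start looking for the pattern asked about above is to count the 1236 entries in the IIS logs per hour and see whether they cluster around particular times (for example, a scheduled recycle). The following is only a minimal Python sketch. It assumes the logs are in the W3C extended format with a `#Fields:` directive that includes `date`, `time` and `sc-win32-status`. The function names and the `SAMPLE` lines are made up for illustration and are not taken from the poster's logs.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request line, keyed by the names in the #Fields: directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive lists the column names, separated by spaces.
            fields = line[len("#Fields:"):].split()
        elif line and not line.startswith("#") and fields:
            values = line.split()
            # Skip truncated or malformed rows instead of guessing at columns.
            if len(values) == len(fields):
                yield dict(zip(fields, values))


def aborted_by_hour(lines, status="1236"):
    """Count requests whose sc-win32-status equals `status`, grouped by date and hour."""
    counts = Counter()
    for row in parse_w3c_log(lines):
        if row.get("sc-win32-status") == status:
            hour = row.get("date", "?") + " " + row.get("time", "??")[:2] + ":00"
            counts[hour] += 1
    return counts


# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    "#Fields: date time cs-method cs-uri-stem sc-status sc-substatus sc-win32-status time-taken",
    "2018-10-22 09:15:02 POST /receive 200 0 0 31",
    "2018-10-22 09:47:40 POST /receive 200 0 1236 15003",
    "2018-10-22 10:03:11 POST /receive 200 0 1236 15001",
    "2018-10-22 10:20:00 POST /receive 200 0 1236 15002",
]

if __name__ == "__main__":
    for hour, count in sorted(aborted_by_hour(SAMPLE).items()):
        print(hour, count)
```

To run it against a real log, pass an open file instead of `SAMPLE`, e.g. `aborted_by_hour(open(path))`, where `path` is wherever your site's log files live. If the counts line up with recycle times or with a particular request volume, that narrows things down before you move on to a network capture.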