
I'm looking after a legacy Windows Server 2003 (SP2) box with IIS 6 installed, and I have a very weird problem that is making me pull my hair out.

  • There are 5 websites configured to run on the particular webserver.
  • Of these, 3 have SSL certs that are functional (and their own dedicated IPs with SSL served on :443)
  • One of these websites is having the following issue:

When the w3svc service is first started (via a reboot, or net stop/start), the initial connections to the SSL service of that particular website immediately fail with ERR_SSL_PROTOCOL_ERROR. I will provide further diagnostic information below.

Now for the list of completely baffling symptoms and attempted remedies:

  • if I click the "stop" icon in the IIS MMC, and then start back up that website (not the w3svc service itself), the problem goes away.
  • if I assign an SSL port other than 443 to the website, the problem never appears
  • if I assign a different SSL cert to the website, the problem persists
  • if I create an entirely new website from scratch just to test this, the problem reappears

I can see no relevant log entries in the Event Viewer (Application, System, or Security) or in the w3svc log files.

Filemon does not show anything weird that I can tell (no "access denied" or "file not found" of any sort). Regardless, I doubt the certificate is actually fetched from a file.

I have run other tests using wget and the SSLDiagtools, all of which report the same failure without any further diagnostic information.

A Wireshark capture reveals that the connection is simply dropped:

  • 226 byte "Client Hello" packet is sent by client to server
  • A [FIN,ACK] packet is immediately sent back by the server to the client

When the server is "remedied" using the stop/start method outlined above, the server responds normally (no such FIN/ACK packet is sent).
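The capture above (TCP connects, then the handshake is cut off right after the Client Hello) can be reproduced without a browser. The following is only a rough sketch in Python, not something from the original troubleshooting: `probe_tls` is a hypothetical helper name, and it simply reports whether the failure happens at the TCP stage or during the TLS handshake.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Return (stage, detail): stage is "tcp", "tls" or "ok".

    "tcp" means the TCP connection itself failed.
    "tls" means TCP connected but the TLS handshake did not complete.
    "ok"  means the handshake completed; detail is the protocol version.
    """
    context = ssl.create_default_context()
    # Diagnostic only: skip certificate validation so that a certificate
    # problem is not mistaken for a dropped handshake.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        return "tcp", repr(exc)

    try:
        # wrap_socket sends the Client Hello and performs the handshake.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return "ok", tls.version()
    except (ssl.SSLError, OSError) as exc:
        return "tls", repr(exc)
    finally:
        raw.close()
```

Calling `probe_tls("www.example.com")` (a placeholder hostname) right after restarting w3svc and again after the stop/start workaround would give two results to compare. One caveat: a current Python/OpenSSL build may refuse the old protocol versions that IIS 6 offers, so a "tls" result on its own is not conclusive; the useful signal is whether the exception text differs between the broken and the "remedied" state.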

The above makes me wonder if some sudden page-fault kind of failure is terminating the thread before any logging gets done.

I am at a loss. I have spent over 8 hours on this bug and still cannot find a way to tackle it. I would love it if someone actually knew what was happening, but I'm really just hoping for some ideas on how to debug or inspect this further to determine the source of the problem.

NB: There are constraints I cannot avoid: reinstalling or upgrading the server is not an option at the moment.

user247243
  • `"if I assign an SSL port other than 443 to the website, the problem never appears. "` I think this is very significant. – I say Reinstate Monica Oct 09 '14 at 00:48
  • It's significant, but I would argue almost every other point is just as significant. Why would stopping and starting the website resolve this? And more importantly, it's not that the port isn't working: the connection is established as far as TCP is concerned. Which means the site has the correct bindings. – user247243 Oct 09 '14 at 01:06

1 Answer


OK, so I have finally chanced upon the solution to this problem. However, the solution gives me no understanding of what the bug was (I think source code would be needed for that, and given that Win2k3/IIS 6 is no longer supported, I do not see the utility in such a venture). Still, better to write this up for the common good.

Following Twisty's advice, I started thinking of the nature of the problem. I first started by stopping all the websites on the server but the one involved, and restarting the webservice. This did nothing.

I then looked at the other inactive websites (ones that had been in a stopped state for a long time) and checked whether they had any possible conflicts. I noticed one of them was configured to use this particular certificate. However, removing the certificate from that site didn't solve the issue. Also, there were other inactive sites that were configured to use other certs, but those certs didn't display this bug.
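For anyone repeating this check on a server with many sites, the comparison I did by eye could be sketched roughly as follows. This is only an illustration in Python: the `sites` records, their field names, and `find_cert_overlaps` are all hypothetical, and the inventory would have to be filled in by hand from the IIS MMC. It only lists stopped sites that share a certificate with a running one; it does not explain the bug.

```python
from collections import defaultdict


def find_cert_overlaps(sites):
    """Group sites by certificate and report certs used by both
    running and stopped sites.

    Each site is a dict with hypothetical keys:
      "name" (str), "running" (bool), "cert" (thumbprint str or None).
    """
    by_cert = defaultdict(list)
    for site in sites:
        if site.get("cert"):
            by_cert[site["cert"]].append(site)

    overlaps = {}
    for cert, group in by_cert.items():
        running = [s["name"] for s in group if s["running"]]
        stopped = [s["name"] for s in group if not s["running"]]
        # Only certs shared across a running and a stopped site are candidates.
        if running and stopped:
            overlaps[cert] = {"running": running, "stopped": stopped}
    return overlaps
```

Any stopped site that shows up in the result is a candidate for closer inspection.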

TL;DR: I finally resorted to simply deleting that deprecated (stopped) website to see if that would do anything, and this did resolve the issue.

I do not know the exact cause, and I (or rather my client) do not have the time to go back and create websites to explore the full behaviour of the bug. But I'm assuming it was related to the certs on inactive sites.

Hope this bug never befalls anyone else out there.

user247243
  • Glad you got it solved. Sometimes simply another set of eyes and perspective is all one needs to cross the final bridge to a resolution. I hope you continue using ServerFault to share your knowledge and questions. – I say Reinstate Monica Oct 09 '14 at 11:43