
I have a Windows Server 2003 R2 32-bit machine running Apache 2.4.2 with OpenSSL 1.0.1c and PHP 5.4.5 via mod_fcgid 2.3.7. This configuration worked fine for some hours, but then the site could no longer be reached by its domain name, say www.example.com, although it could still be reached by its IP address.

In particular, while https://www.example.com/ yielded a connection error, http://123.1.2.3/ worked just fine. Yes, first https then http.

The error and access logs were clean, i.e. they showed no signs of problems. They contained just the usual messages, which stopped while the site couldn't be reached.

After some investigation, a simple restart of Apache solved the problem. Unfortunately, I didn't have the chance to test if https://123.1.2.3/ worked as well, or if http://www.example.com/ was still redirected to https as usual.

So, does anyone have any idea of what happened, before I get tired of Apache and ditch it in favor of Nginx?

Edit: Some log information.

The last line of sslerror.log is from 90 minutes before the problem occurred, so I guess it's not important. ssl_request.log shows nothing interesting either. These are the last two lines before the problem:

[28/Aug/2012:17:47:54 +0200] x.x.x.x TLSv1.1 ECDHE-RSA-AES256-SHA "GET /login HTTP/1.1" 1183
[28/Aug/2012:17:47:45 +0200] y.y.y.y TLSv1 ECDHE-RSA-AES256-SHA "POST /upf HTTP/1.1" 73

The previous lines are all the same and don't seem interesting, except 4 lines like these 30-40 seconds before the problem:

[28/Aug/2012:17:47:14 +0200] z.z.z.z TLSv1 ECDHE-RSA-AES256-SHA "-" -

These are the corresponding lines from sslaccess.log:

z.z.z.z - - [28/Aug/2012:17:47:14 +0200] "-" 408 -
...
x.x.x.x - - [28/Aug/2012:17:47:54 +0200] "GET /login HTTP/1.1" 200 1183
y.y.y.y - - [28/Aug/2012:17:47:45 +0200] "POST /upf HTTP/1.1" 200 73

It seems some connections timed out?
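A 408 status is "Request Timeout", and a request logged as "-" means no request line was received on that connection, so the hunch seems plausible. To see how often this happens, here is a minimal sketch that pulls the 408 entries out of an access log. It assumes the log uses the same format as the lines shown above; the function name `find_timeouts` and the `sample` list are just illustrative.

```python
import re

# Matches lines in the format shown above, e.g.
# z.z.z.z - - [28/Aug/2012:17:47:14 +0200] "-" 408 -
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)$')


def find_timeouts(lines):
    """Return (client, timestamp) pairs for every line with status 408."""
    hits = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(4) == "408":
            hits.append((match.group(1), match.group(2)))
    return hits


# Sample input copied from the sslaccess.log excerpt above.
sample = [
    'z.z.z.z - - [28/Aug/2012:17:47:14 +0200] "-" 408 -',
    'x.x.x.x - - [28/Aug/2012:17:47:54 +0200] "GET /login HTTP/1.1" 200 1183',
    'y.y.y.y - - [28/Aug/2012:17:47:45 +0200] "POST /upf HTTP/1.1" 200 73',
]
print(find_timeouts(sample))
```

To run it against the real file, pass an open file object instead of `sample`; lines that do not match the assumed format are silently skipped.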

The virtual server listening on port 80 usually redirects all connections to https, so access.log shows nothing for the 40 minutes before the problem. error.log shows some warnings 4 minutes before the issue:

[Tue Aug 28 17:53:30.921034 2012] [fcgid:warn] [pid 1964:tid 1728] mod_fcgid: process 1852 graceful kill fail, sending SIGKILL

I get a lot of these warnings. I guess that's normal?

MaxArt
  • Without relevant information from log files and exact error messages, we will just be guessing! – Khaled Aug 29 '12 at 08:47
  • @Khaled I'll try to provide some, but I really don't know what to provide. I'm not an expert, so can you tell me what can be useful? – MaxArt Aug 29 '12 at 08:51

1 Answer

This sounds like a DNS issue. When the site becomes unreachable by name you need to first ensure that the name correctly resolves to the IP address of the server. There are many ways to do this, such as performing an nslookup or even just a ping of the name. Only if you are indeed getting the correct address should you start looking at the Apache end of things.
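As a rough illustration of that order of checks, the sketch below first resolves the name and then tries a plain TCP connection to the expected address on ports 80 and 443. It is only a starting point: the helper names (`resolve`, `port_open`, `diagnose`) are made up for this example, the 5-second timeout is an arbitrary choice, and a successful TCP connection does not prove that the TLS handshake or the right virtual host works.

```python
import socket


def resolve(name):
    """Return the sorted IPv4 addresses the name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(name, expected_ip, ports=(80, 443)):
    """Check DNS first, then whether each port answers on the expected address."""
    addresses = resolve(name)
    report = {"resolved": addresses, "dns_ok": expected_ip in addresses}
    for port in ports:
        report[port] = port_open(expected_ip, port)
    return report
```

With the placeholder values from the question, the call would be `diagnose("www.example.com", "123.1.2.3")`. If `dns_ok` is true but port 443 is reported closed while port 80 is open, the problem is more likely on the Apache/SSL side than in DNS.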

John Gardeniers
  • Ah, yes, I forgot to mention it: at first I thought about the DNS too, but a simple `tracert` was able to resolve it to its IP address. The request failed just when it was about to reach the host. – MaxArt Aug 29 '12 at 10:28