7

I must say that I am not the guy that set up the server, but the poor sap tasked with finding out what is happening for the time being. I only have a rough knowledge about apache and linux, so please bear with me...

The Problem

An apache server of ours shuts down (mostly during the night) and fails to recover and start back up again, without anyone actually telling it to.

From what I can gather, the interesting thing here is the following line in the apache error logs:

[Wed Apr 15 03:43:02.114879 2015] [mpm_prefork:notice] [pid 25778] AH00171: Graceful restart requested, doing restart

After which comes a very long stacktrace, here are the first few lines:

*** Error in `/usr/sbin/httpd': free(): invalid pointer: 0x00007f581d5c13c0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x7d19d)[0x7f583b69519d]
/etc/httpd/modules/libphp5.so(php_module_shutdown+0x2b)[0x7f58301d255b]
/etc/httpd/modules/libphp5.so(php_module_shutdown_wrapper+0x9)[0x7f58301d2619]
[...]

The backtrace goes on for a while, but the interesting thing here is, that in between that, the Error in /usr/sbin/httpd [...] repeats itself 5 times, until the last message is

[Wed Apr 15 03:43:02.269626 2015] [core:notice] [pid 25778] AH00060: seg fault or similar nasty error detected in the parent process

The next message is on the next day me booting the server up again:

[Wed Apr 15 08:14:46.200884 2015] [core:notice] [pid 30326] SELinux policy enabled; httpd running as context system_u:system_r:httpd_t:s0
[Wed Apr 15 08:14:46.215410 2015] [suexec:notice] [pid 30326] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Wed Apr 15 08:14:46.235346 2015] [auth_digest:notice] [pid 30326] AH01757: generating secret for digest authentication ...
[Wed Apr 15 08:14:46.236045 2015] [lbmethod_heartbeat:notice] [pid 30326] AH02282: No slotmem from mod_heartmonitor
[Wed Apr 15 08:14:46.280992 2015] [core:warn] [pid 30326] AH00098: pid file /run/httpd/httpd.pid overwritten -- Unclean shutdown of previous Apache run?
[Wed Apr 15 08:14:46.284919 2015] [mpm_prefork:notice] [pid 30326] AH00163: Apache/2.4.6 (CentOS) PHP/5.4.16 configured -- resuming normal operations
[Wed Apr 15 08:14:46.284939 2015] [core:notice] [pid 30326] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'

Thoughts

As I see it, the prefork-module somehow requests that the apache process (or one of them?) shuts down. This fails and in return the whole things just crashes horribly.

Questions

First and foremost: Is my analysis - that the problem crashing the server is the prefork module - correct?

Should I just disable the prefork module? I don't see how I can debug / help a memory error inside the software itself.

What implications does it have when I disable this module?

Version information

Linux version 3.10.0-123.13.2.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) )

Server version: Apache/2.4.6 (CentOS)
Server built:   Jan 12 2015 13:22:31
Server's Module Magic Number: 20120211:23
Server loaded:  APR 1.4.8, APR-UTIL 1.5.2
Compiled using: APR 1.4.8, APR-UTIL 1.5.2
Architecture:   64-bit
Server MPM:     prefork
  threaded:     no
    forked:     yes (variable process count)
Server compiled with....
 -D APR_HAS_SENDFILE
 -D APR_HAS_MMAP
 -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
 -D APR_USE_SYSVSEM_SERIALIZE
 -D APR_USE_PTHREAD_SERIALIZE
 -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
 -D APR_HAS_OTHER_CHILD
 -D AP_HAVE_RELIABLE_PIPED_LOGS
 -D DYNAMIC_MODULE_LIMIT=256
 -D HTTPD_ROOT="/etc/httpd"
 -D SUEXEC_BIN="/usr/sbin/suexec"
 -D DEFAULT_PIDLOG="/run/httpd/httpd.pid"
 -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
 -D DEFAULT_ERRORLOG="logs/error_log"
 -D AP_TYPES_CONFIG_FILE="conf/mime.types"
 -D SERVER_CONFIG_FILE="conf/httpd.conf"

Responses to comments

cron

There is no /etc/cron.d/dailyjobs, but only a 0hourly script, which executes the hourly scripts 0anacron, 0yum-hourly.cron and dellrda.cron - None of which seem to do anything related to the apache (IMHO)

/etc/logrotate.d/httpd

/var/log/httpd/*log {
    missingok
    notifempty
    sharedscripts
    delaycompress
    postrotate
        /bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true
    endscript
}

Manual reload

/bin/systemctl reload httpd.service

Brings the following result

Job for httpd.service failed. See 'systemctl status httpd.service' and 'journalctl -xn' for details.

And in the error_log the same messages as above are shown.

A quick systemctl status httpd.service reveals:

httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: failed (Result: signal) since Fri 2015-04-17 12:26:36 CEST; 8s ago
  Process: 8828 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
  Process: 8826 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
  Process: 8767 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=killed, signal=ABRT)
 Main PID: 8767 (code=killed, signal=ABRT)
   Status: "Total requests: 0; Current requests/sec: 0; Current traffic:   0 B/sec"

apachectl -M

Loaded Modules:
 core_module (static)
 so_module (static)
 http_module (static)
 access_compat_module (shared)
 actions_module (shared)
 alias_module (shared)
 allowmethods_module (shared)
 auth_basic_module (shared)
 auth_digest_module (shared)
 authn_anon_module (shared)
 authn_core_module (shared)
 authn_dbd_module (shared)
 authn_dbm_module (shared)
 authn_file_module (shared)
 authn_socache_module (shared)
 authz_core_module (shared)
 authz_dbd_module (shared)
 authz_dbm_module (shared)
 authz_groupfile_module (shared)
 authz_host_module (shared)
 authz_owner_module (shared)
 authz_user_module (shared)
 autoindex_module (shared)
 cache_module (shared)
 cache_disk_module (shared)
 data_module (shared)
 dbd_module (shared)
 deflate_module (shared)
 dir_module (shared)
 dumpio_module (shared)
 echo_module (shared)
 env_module (shared)
 expires_module (shared)
 ext_filter_module (shared)
 filter_module (shared)
 headers_module (shared)
 include_module (shared)
 info_module (shared)
 log_config_module (shared)
 logio_module (shared)
 mime_magic_module (shared)
 mime_module (shared)
 negotiation_module (shared)
 remoteip_module (shared)
 reqtimeout_module (shared)
 rewrite_module (shared)
 setenvif_module (shared)
 slotmem_plain_module (shared)
 slotmem_shm_module (shared)
 socache_dbm_module (shared)
 socache_memcache_module (shared)
 socache_shmcb_module (shared)
 status_module (shared)
 substitute_module (shared)
 suexec_module (shared)
 unique_id_module (shared)
 unixd_module (shared)
 userdir_module (shared)
 version_module (shared)
 vhost_alias_module (shared)
 dav_module (shared)
 dav_fs_module (shared)
 dav_lock_module (shared)
 lua_module (shared)
 mpm_prefork_module (shared)
 proxy_module (shared)
 lbmethod_bybusyness_module (shared)
 lbmethod_byrequests_module (shared)
 lbmethod_bytraffic_module (shared)
 lbmethod_heartbeat_module (shared)
 proxy_ajp_module (shared)
 proxy_balancer_module (shared)
 proxy_connect_module (shared)
 proxy_express_module (shared)
 proxy_fcgi_module (shared)
 proxy_fdpass_module (shared)
 proxy_ftp_module (shared)
 proxy_http_module (shared)
 proxy_scgi_module (shared)
 systemd_module (shared)
 cgi_module (shared)
 php5_module (shared)
Damon Hill
  • 77
  • 2
  • 13
F.P
  • 153
  • 4
  • 15
  • 2
    The graceful restart in the middle of the night is probably caused by logrotate running at this time. When does logrotate run and what's in your apache's logrotate script? Please post the content of `/etc/cron.d/dailyjobs` and `/etc/logrotate.d/httpd` – etagenklo Apr 17 '15 at 09:54
  • @etagenklo Done - I think the `/bin/systemctl reload httpd` is the culprit restarting apache, right? – F.P Apr 17 '15 at 10:03
  • 1
    What happens if you do the reload manually (i. e. issue the command `/bin/systemctl reload httpd.service` ? – etagenklo Apr 17 '15 at 10:23
  • @etagenklo The `error_log` shows the same messages as posted above, and the `systemctl` tells me that the *Job for httpd.service failed*. – F.P Apr 17 '15 at 10:29
  • Could you post the output of `apachectl -M` ? – etagenklo Apr 17 '15 at 10:47
  • @etagenklo Oh, well - thats a long one... But okay :D – F.P Apr 17 '15 at 10:52
  • 1
    You menton it's 'mostly at night', suggesting that it's happening at other times. If it's not happening at a really consistent time (indicating a cron job), you might want to montitor memory usage and disk space on /tmp and / ... I've seen similar issues before when a CentOS system ran out of memory, and I can't remember if it restarted or just shut things down. – Joe H. Apr 17 '15 at 14:22

1 Answers1

4

My guess was that you hit this bug: http://bugs.centos.org/view.php?id=7669 , but I don't see mod_gnutls in your loaded modules. Nevertheless, to me it seems like you've hit some kind of bug. As I can see, you don't use the latest httpd version, so you should first try and update your httpd package:

yum update

Please restart your httpd afterwards, then try if the reload problem persists. If it's still there, I'd suggest to open a bug report at CentOS for this. To avoid the crash in the middle of the night in the meantime, I'd suggest to edit your logrotate script to use

/bin/systemctl restart httpd.service

instead of it's reload counterpart until the issue is solved.

EDIT:

Before opening a bug report at CentOS, you should make sure that you are using only standard CentOS apache packages and modules. If you use any apache modules that are self compiled or installed from 3rd party repositories, they probably won't accept this.

To show all installed packages with the repo they come from, you can use the command

rpm -qa --qf '%{NAME} %{VENDOR}\n'
etagenklo
  • 5,834
  • 1
  • 27
  • 32
  • Wow, what a research. I will look into these bugs and see if I can manage to update as soon as something happens there. In the meantime, thanks for the aweseome help. I can only upvote once unfortunately, you deserve a lot more :-) – F.P Apr 17 '15 at 11:16
  • Thanks for the upvote, I've edited my answer with an additional thought... – etagenklo Apr 17 '15 at 12:03