I am using Pacemaker with Corosync to set up a basic Apache HA cluster with 3 nodes running CentOS. For some reason, I cannot get the apache resource to start in pcs.
Cluster IP: 192.168.200.40
# pcs resource show ClusterIP
Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
Attributes: cidr_netmask=24 ip=192.168.200.40
Operations: monitor interval=20s (ClusterIP-monitor-interval-20s)
start interval=0s timeout=20s (ClusterIP-start-interval-0s)
stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)
# pcs resource show WebServer
Resource: WebServer (class=ocf provider=heartbeat type=apache)
Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status
Operations: monitor interval=1min (WebServer-monitor-interval-1min)
start interval=0s timeout=40s (WebServer-start-interval-0s)
stop interval=0s timeout=60s (WebServer-stop-interval-0s)
# pcs status
Cluster name:
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: server3.example.com (version 1.1.18-11.el7_5.2-2b07d5c5a9) - partition with quorum
Last updated: Thu Jun 7 21:59:09 2018
Last change: Thu Jun 7 21:45:23 2018 by root via cibadmin on server1.example.com
3 nodes configured
2 resources configured
Online: [ server1.example.com server2.example.com server3.example.com ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started server2.example.com
WebServer (ocf::heartbeat:apache): Stopped
Failed Actions:
* WebServer_start_0 on server3.example.com 'unknown error' (1): call=49, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:46:03 2018', queued=0ms, exec=40002ms
* WebServer_start_0 on server1.example.com 'unknown error' (1): call=53, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:45:23 2018', queued=0ms, exec=40003ms
* WebServer_start_0 on server2.example.com 'unknown error' (1): call=47, status=Timed Out, exitreason='',
last-rc-change='Thu Jun 7 21:46:43 2018', queued=1ms, exec=40002ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
The httpd service is enabled and running on all three nodes. The web page is reachable through the cluster IP and through each node's own IP. Failover of the ClusterIP resource also works well. What could be going wrong with the apache resource in this case?
Thank you very much!
Update:
Here is more information from the debug output. Apache appears to be unable to bind to port 80, yet the Apache log shows no errors, and systemctl status httpd reports the service as active on all nodes. I can open web pages via the cluster IP and the node IPs, and ClusterIP failover works fine too. Any idea why the Apache resource doesn't work with Pacemaker?
# pcs resource debug-start WebServer --full
Operation start for WebServer (ocf:heartbeat:apache) failed: 'Timed Out' (2)
> stderr: ERROR: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80 (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80 no listening sockets available, shutting down AH00015: Unable to open logs
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
> stderr: INFO: waiting for apache /etc/httpd/conf/httpd.conf to come up
> stderr: INFO: apache not running
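Since the debug output above reports "Address already in use" for port 80, one way to narrow this down is to list what is already listening on that port on each node. The following is only a sketch: the helper name listeners_on_port is hypothetical, and it assumes the usual column layout of `ss -ltn` (from iproute2), where the fourth field is the local address and port.

```shell
# Hypothetical helper: reads `ss -ltn`-style output on stdin and prints
# every local address that is listening on the given port.
# Assumes a one-line header and the local address:port in field 4.
listeners_on_port() {
    port="$1"
    awk -v p="$port" 'NR > 1 && $4 ~ (":" p "$") { print $4 }'
}

# Usage on a cluster node (needs the ss tool to be installed):
#   ss -ltn | listeners_on_port 80
# Running `ss -ltnp` as root additionally shows the owning process.
```

If this prints anything for port 80 while the WebServer resource is stopped, some process outside the cluster's control is already holding the port on that node.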