
I have a problem getting apache to work in a corosync cluster.

I have probably sifted through more than a hundred web pages and a couple of dozen Google searches, but was not able to find an answer that matches my issue.

root@hh1web03t ~# uname -a
Linux hh1web03t 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@hh1web03t ~# more /etc/centos-release
CentOS Linux release 7.4.1708 (Core)
root@hh1web03t ~# yum list installed corosync httpd crmsh ldirectord
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.ratiokontakt.de
 * epel: mirror.speedpartner.de
 * extras: mirror.checkdomain.de
 * updates: mirror.ratiokontakt.de
Installed Packages
corosync.x86_64        2.4.0-9.el7_4.2             @updates
crmsh.noarch           3.0.0-6.2                   @network_ha-clustering_Stable
httpd.x86_64           2.4.6-67.el7.centos.6       @updates
ldirectord.x86_64      3.9.6-0rc1.1.2              @network_ha-clustering_Stable

We have 4 physical IPs and 10 VIPs. crm status looks like this:

root@hh1web03t ~# crm status

Stack: corosync
Current DC: hh1web03t (version 1.1.16-12.el7_4.7-94ff4df) - partition with quorum
Last updated: Thu Mar 29 15:53:27 2018
Last change: Thu Mar 29 15:28:47 2018 by hacluster via crmd on hh1web01t

4 nodes configured
16 resources configured

Online: [ hh1web01t hh1web02t hh1web03t hh1web04t ]

Full list of resources:

 pingd  (ocf::pacemaker:ping):  Started hh1web03t
 Resource Group: gp_LVS
     ldirectord (ocf::heartbeat:ldirectord):    Started hh1web03t
     vip_151    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_152    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_153    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_154    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_155    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_156    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_157    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_158    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_159    (ocf::heartbeat:IPaddr2):       Started hh1web03t
     vip_160    (ocf::heartbeat:IPaddr2):       Started hh1web03t
 Clone Set: cl_vip151 [vip_151_apache]
     Started: [ hh1web01t hh1web02t hh1web03t hh1web04t ]

But that only works with a plain "Listen 80" statement in "httpd.conf". As soon as I add the VIP to the Listen directive (Listen 10.49.4.151:80), the startup of httpd fails.

From another cluster I know that the VIPs should be held in standby on the "lo" loopback interface, but here they are not. So my assumption is that the problem lies within my cluster configuration, not the Apache configuration.
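For reference, on LVS-DR realservers the VIP is typically configured on the loopback interface with ARP suppressed, so that only the director answers ARP for the VIP on the LAN. A minimal sketch of what that looks like when done by hand (the address and label are taken from this setup; in a Pacemaker cluster this is normally handled by the resource agent rather than manually):

```shell
# Suppress ARP replies/announcements for addresses held on lo (LVS-DR convention)
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2

# Add the VIP to the loopback with a /32 so it is never announced on the LAN
ip addr add 10.49.4.151/32 dev lo label lo:151
```

With the VIP present on lo, a local httpd can bind "Listen 10.49.4.151:80" even on the standby realservers.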

Active Node:

root@hh1web03t ~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:50:56:aa:1d:ca brd ff:ff:ff:ff:ff:ff
    inet 10.49.4.103/24 brd 10.49.4.255 scope global ens192
       valid_lft forever preferred_lft forever
    inet 10.49.4.151/24 brd 10.49.4.255 scope global secondary ens192:151
       valid_lft forever preferred_lft forever
    inet 10.49.4.152/24 brd 10.49.4.255 scope global secondary ens192:152
       valid_lft forever preferred_lft forever
    inet 10.49.4.153/24 brd 10.49.4.255 scope global secondary ens192:153
       valid_lft forever preferred_lft forever
    inet 10.49.4.154/24 brd 10.49.4.255 scope global secondary ens192:154
       valid_lft forever preferred_lft forever
    inet 10.49.4.155/24 brd 10.49.4.255 scope global secondary ens192:155
       valid_lft forever preferred_lft forever
    inet 10.49.4.156/24 brd 10.49.4.255 scope global secondary ens192:156
       valid_lft forever preferred_lft forever
    inet 10.49.4.157/24 brd 10.49.4.255 scope global secondary ens192:157
       valid_lft forever preferred_lft forever
    inet 10.49.4.158/24 brd 10.49.4.255 scope global secondary ens192:158
       valid_lft forever preferred_lft forever
    inet 10.49.4.159/24 brd 10.49.4.255 scope global secondary ens192:159
       valid_lft forever preferred_lft forever
    inet 10.49.4.160/24 brd 10.49.4.255 scope global secondary ens192:160
       valid_lft forever preferred_lft forever

Standby Node:

root@hh1web04t ~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens192: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 100
    link/ether 00:50:56:aa:0c:a9 brd ff:ff:ff:ff:ff:ff
    inet 10.49.4.104/24 brd 10.49.4.255 scope global ens192
       valid_lft forever preferred_lft forever

Here is what telnet and netstat show:

root@hh1web03t ~# telnet 10.49.4.151 80
Trying 10.49.4.151...

root@hh1web03t ~# telnet 10.49.4.101 80
Trying 10.49.4.101...
Connected to 10.49.4.101.
Escape character is '^]'.


^C
Connection closed by foreign host.
root@hh1web03t ~#
root@hh1web03t ~# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      3470/httpd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      1093/sshd
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      3470/httpd
tcp6       0      0 :::2224                 :::*                    LISTEN      818/ruby
tcp6       0      0 :::22                   :::*                    LISTEN      1093/sshd
tcp6       0      0 :::6556                 :::*                    LISTEN      1098/xinetd
udp        0      0 10.49.4.160:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.159:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.158:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.157:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.156:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.155:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.154:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.153:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.152:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.151:123         0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.103:123         0.0.0.0:*                           847/ntpd
udp        0      0 127.0.0.1:123           0.0.0.0:*                           847/ntpd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           847/ntpd
udp        0      0 10.49.4.103:39328       0.0.0.0:*                           1165/corosync
udp        0      0 10.49.4.103:47882       0.0.0.0:*                           1165/corosync
udp        0      0 10.49.4.103:58219       0.0.0.0:*                           1165/corosync
udp        0      0 10.49.4.103:52173       0.0.0.0:*                           1165/corosync
udp        0      0 10.49.4.103:5409        0.0.0.0:*                           1165/corosync
udp6       0      0 :::123                  :::*                                847/ntpd
root@hh1web03t ~#

I can ssh into the VIP, which, in this scenario, will take me to host hh1web03t:

root@hh1web04t ~# ssh 10.49.4.151
Last login: Thu Mar 29 16:19:39 2018 from hh1web03t
root@hh1web03t ~#

Here is the relevant section of crm configure show; please note that lvs_support is set to true:

primitive vip_151 IPaddr2 \
        params ip=10.49.4.151 cidr_netmask=24 iflabel=151 nic=ens192 lvs_support=true \
        meta target-role=Started \
        op monitor interval=30s
primitive vip_151_apache apache \
        params httpd="/usr/sbin/httpd" options="-d /etc/httpd" configfile="/etc/httpd/vip-151/httpd.conf" \
        op monitor interval=30s

I wanted to attach the corosync.log, but that took me over the 30000-character limit for a question. So here is the one line that documents the failure (not saying it is the problem):

Mar 29 13:15:53  apache(vip_151_apache)[1897]:    ERROR: AH00180: WARNING: MaxRequestWorkers of 256 exceeds ServerLimit value of 100 servers, decreasing MaxRequestWorkers to 100. To increase, please see the ServerLimit directive. (99)Cannot assign requested address: AH00072: make_sock: could not bind to address 10.49.4.151:80 no listening sockets available, shutting down AH00015: Unable to open logs
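The "(99)Cannot assign requested address" part of that message is the key: at the moment httpd starts, 10.49.4.151 is not configured on any local interface of that node, so the bind for the Listen directive fails. One possible workaround, which the error points at but which may or may not be the right fix for this cluster, is to allow binding to non-local addresses:

```shell
# Allow processes to bind() to IPs not currently present on any interface,
# so httpd can start with "Listen 10.49.4.151:80" before the VIP arrives
sysctl -w net.ipv4.ip_nonlocal_bind=1
```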

I think I made a minor mistake in the setup, but I simply cannot find it.

Any help would be greatly appreciated!

Thank you all. Rudy

  • You could simplify the problem by having Apache listen on 0.0.0.0 and then control access as needed on each VIP at the firewall. Also, unrelated to your question, but a cluster should have an odd number of hosts to prevent a vote split during a failure. – brent Mar 29 '18 at 15:37
  • If you discover how the "other cluster" is able to cheat crmd into registering a VIP onto the `lo` interface, that would be valuable info. Please do not hesitate to self-answer how that is done. Normally a VIP is only added to an interface if they are on the same IP subnet. – kubanczyk Mar 29 '18 at 16:00

1 Answer


In a corosync cluster, Apache shouldn't be running on every host; you need to tie it to the VIP resource so that it only runs on the active host, e.g.:

pcs constraint colocation add vip_151_apache vip_151 INFINITY

Or, if you're only using crmsh:

crm configure colocation apache_on_151 INFINITY: vip_151_apache vip_151
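A colocation constraint alone does not guarantee start order, so it is common to pair it with an order constraint so that Apache only starts once the VIP is up. A sketch in crmsh syntax (the constraint name apache_after_151 is my own choice):

```shell
crm configure order apache_after_151 Mandatory: vip_151 vip_151_apache
```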
– brent