
I'm running Corosync + Pacemaker on Ubuntu 14.04. I set up two nodes with two VIPs. When I bring Pacemaker down on one node, the VIPs do move to the other node, but no traffic actually goes through them until I manually run ifdown eth1 and ifup eth1 on that node (the VIPs are on eth1). This is true for both nodes.

Using ifconfig I can see that the VIPs were transferred fine, but bmon shows that no traffic goes through until I run ifdown / ifup.

crm configure show:

node $id="6" node1
node $id="3" node2
primitive VIP_3 ocf:heartbeat:IPaddr2 \
    params ip="10.0.1.112" nic="eth1" iflabel="3" \
    op monitor interval="10s" on-fail="restart" \
    op start interval="0" timeout="1min" \
    op stop interval="0" timeout="30s"
primitive VIP_4 ocf:heartbeat:IPaddr2 \
    params ip="10.0.1.111" nic="eth1" iflabel="4" \
    op monitor interval="10s" on-fail="restart" \
    op start interval="0" timeout="1min" \
    op stop interval="0" timeout="30s"
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    no-quorum-policy="ignore" \
    stonith-enabled="false"

crm_mon -1:

Last updated: Mon Feb 16 16:16:42 2015
Last change: Mon Feb 16 15:43:30 2015 via crmd on node1
Stack: corosync
Current DC: node1 (6) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
2 Resources configured


Online: [ node1 node2 ]

 VIP_4  (ocf::heartbeat:IPaddr2):   Started node1 
 VIP_3  (ocf::heartbeat:IPaddr2):   Started node1 

ifconfig (on node1):

eth0      Link encap:Ethernet  HWaddr 00:0c:29:b2:19:ba  
          inet addr:10.0.0.192  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:feb2:19ba/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:253948 errors:0 dropped:73 overruns:0 frame:0
          TX packets:116222 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:95400133 (95.4 MB)  TX bytes:20760101 (20.7 MB)

eth1      Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet6 addr: fe80::20c:29ff:feb2:19c4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:24763724 errors:0 dropped:19558 overruns:0 frame:0
          TX packets:23253310 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:15916162148 (15.9 GB)  TX bytes:15816322712 (15.8 GB)

eth1:3    Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet addr:10.0.1.112  Bcast:10.0.1.240  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:4    Link encap:Ethernet  HWaddr 00:0c:29:b2:19:c4  
          inet addr:10.0.1.111  Bcast:10.0.1.239  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:62428 errors:0 dropped:0 overruns:0 frame:0
          TX packets:62428 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8020634 (8.0 MB)  TX bytes:8020634 (8.0 MB)

/etc/network/interfaces:

auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
    address 10.0.0.192
    netmask 255.255.255.0
    gateway 10.0.0.138
    dns-nameservers 8.8.8.8 8.8.4.4
auto eth1
iface eth1 inet manual
    post-up ip route add 10.0.1.0/24 dev eth1 table 11
    post-up ip rule add from 10.0.1.0/24 table 11
    post-up ip rule add to 10.0.1.0/24 table 11
    pre-down ip rule delete table 11
    pre-down ip rule delete table 11
    pre-down ip route flush table 11

Any ideas what I'm doing wrong? I would expect that once Pacemaker brings up the IP addresses, traffic would start flowing through them without my having to run ifdown eth1 and ifup eth1.

Thanks!

moomima
  • First, you need to give some clearer information about your cluster configuration: what is your cluster's private network, eth0 or eth1? – c4f4t0r Feb 16 '15 at 14:39
  • I have two nodes in the cluster, each running a custom web server of sorts... Corosync communicates on eth0, while the VIPs are on eth1. Let me know if more info is needed. Thanks! – moomima Feb 16 '15 at 16:33
  • http://www.hastexo.com/resources/hints-and-kinks/network-connectivity-check-pacemaker – c4f4t0r Feb 16 '15 at 18:57

1 Answer


I'd say that this is an ARP cache issue: the clients (and anything else on that network) that previously talked to the VIP still have an ARP entry mapping the VIP to the MAC address of the node that previously held it.
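
A quick way to confirm this is to check the ARP cache on a client that was previously talking to the VIP and compare the cached MAC with the eth1 MAC of the node that currently holds the VIP (00:0c:29:b2:19:c4 on node1 in the ifconfig output above). A minimal check, assuming an iproute2-based client and using one of the VIPs from the question:

ip neigh show | grep 10.0.1.111    # or, with net-tools: arp -n | grep 10.0.1.111

If the cached MAC still belongs to the other node, the client is working off a stale ARP entry.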

You have three options:

  • flush the ARP cache on every client that has previously communicated with the VIP
  • create a new sub-interface of eth1 and configure the same MAC address on both servers, making sure that only one of the two interfaces is active at any given time
  • broadcast a gratuitous ARP to the affected network so that clients update their ARP tables with the new MAC-to-VIP mapping (see the sketch after this list)
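
For the first and third options, a minimal sketch (assuming the iputils version of arping; the interface and VIPs are taken from the question, the counts are arbitrary):

# on each affected client: drop the stale entry
ip neigh flush to 10.0.1.111    # or, with net-tools: arp -d 10.0.1.111
ip neigh flush to 10.0.1.112

# on the node that now holds the VIPs: send unsolicited (gratuitous) ARP
arping -U -c 3 -I eth1 10.0.1.111
arping -U -c 3 -I eth1 10.0.1.112

Note that the Habets variant of arping uses different flags, so check man arping on your distribution.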

For reference, see how the VRRP protocol handles the VIP transfer.
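
Note also that the IPaddr2 agent normally sends gratuitous ARPs itself when it starts the address, and depending on your resource-agents version it exposes parameters to tune that behaviour. As a sketch only (check crm ra info ocf:heartbeat:IPaddr2 on your nodes for the exact parameter names and defaults):

crm configure edit VIP_4
# then add something like the following to the params line:
#   arp_count="5" arp_interval="200"

If those gratuitous ARPs are dropped or filtered somewhere on the 10.0.1.0/24 network, clients will keep the stale entry regardless of what the cluster does.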

Roman Spiak