I'm trying to get Linux bonding working over a VPN (GRE-TAP). The funny thing is that it only works when I have tcpdump running on both hosts, but more on that later...
There are two machines, called pxn1 and pxn2. They are connected to each other via eth1 through a simple switch.
pxn1 has IP address 10.1.1.197
pxn2 has IP address 10.1.1.199
IPsec
To get a secure connection, all IP traffic is encrypted using IPsec. This works: I can ping between the two machines without any problem, and tcpdump shows only encrypted packets.
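The IPsec setup itself is not the focus here; conceptually it amounts to a policy roughly like the following setkey sketch (just an illustration assuming transport-mode ESP between the two hosts, shown from pxn1's point of view, not my literal configuration):

# encrypt everything between the two hosts with ESP in transport mode
spdadd 10.1.1.197 10.1.1.199 any -P out ipsec esp/transport//require;
spdadd 10.1.1.199 10.1.1.197 any -P in ipsec esp/transport//require;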
GRE-TAP
A GRE-TAP interface (which tunnels Ethernet frames over IP) is then set up in both directions, because I will need a virtual network interface later on:
ip link add vpn_gre_pxn2 type gretap local 10.1.1.197 remote 10.1.1.199 dev eth1
ifconfig shows:
vpn_gre_pxn2 Link encap:Ethernet HWaddr 1a:73:32:7f:36:5f
inet6 addr: fe80::1873:32ff:fe7f:365f/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1462 Metric:1
RX packets:19 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1294 (1.2 KiB) TX bytes:1916 (1.8 KiB)
This is on pxn1. On the other host the same interface is set up in the other direction.
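For reference, the mirrored command on pxn2 looks like this (I'm assuming the interface is called vpn_gre_pxn1 there; both tunnel interfaces are of course also brought up):

ip link add vpn_gre_pxn1 type gretap local 10.1.1.199 remote 10.1.1.197 dev eth1
ip link set vpn_gre_pxn1 up    # and likewise "ip link set vpn_gre_pxn2 up" on pxn1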
Bridge
A bridge is set up that currently uses only the GRE-TAP device.
I need the bridge because later on I want to add more machines (my plan is to bridge all GRE tunnels together). The end result should become a VPN mesh network, with a dedicated GRE-TAP interface for each host-host combination. Since for now I'm just doing a first test with two machines, the bridge is of course somewhat redundant, but it is nonetheless important for the test itself.
brctl addbr vpn_br
brctl addif vpn_br vpn_gre_pxn2
The bridge works: when I activate the vpn_br interface and set up some IP addresses (just for testing the bridge), ICMP pings work perfectly (roughly as sketched below).
vpn_br Link encap:Ethernet HWaddr 02:00:0a:01:01:c5
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1462 Metric:1
RX packets:11 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:448 (448.0 B) TX bytes:468 (468.0 B)
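For completeness, the bridge activation and the temporary test addressing went roughly like this (the 192.168.200.x addresses are just hypothetical placeholders for whatever test addresses were used, and were removed again afterwards):

ip link set vpn_br up
ip addr add 192.168.200.1/24 dev vpn_br    # on pxn1, placeholder test address
ip addr add 192.168.200.2/24 dev vpn_br    # on pxn2, placeholder test address
ping 192.168.200.2                         # works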
Bonding
A Linux bonding interface is now set up. Again, since this is just a first proof-of-concept test, I'll only add a single slave to the bond.
Later on there will also be a real separate Gbit NIC with a dedicated switch that will act as the primary slave (with the VPN being just a backup), but for now the bonding interface will use the VPN only.
modprobe bonding mode=1 miimon=1000
ifconfig bond0 hw ether 02:00:0a:01:01:c5 # some dummy MAC
ifconfig bond0 up
ifconfig bond0 mtu 1462
ifenslave bond0 vpn_br # as said, only a single slave at the moment
ifconfig bond0 172.16.1.2/24 up
The other host is set up as 172.16.1.1/24 with HWaddr 02:00:0a:01:01:c7.
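For reference, the setup on pxn2 is essentially the mirror image of the above (same commands, assuming its bridge is also called vpn_br and contains its GRE-TAP interface):

modprobe bonding mode=1 miimon=1000
ifconfig bond0 hw ether 02:00:0a:01:01:c7   # dummy MAC on pxn2
ifconfig bond0 up
ifconfig bond0 mtu 1462
ifenslave bond0 vpn_br
ifconfig bond0 172.16.1.1/24 up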
This results in a theoretically working bonding interface:
bond0 Link encap:Ethernet HWaddr 02:00:0a:01:01:c5
inet addr:172.16.1.2 Bcast:172.16.1.255 Mask:255.255.255.0
inet6 addr: fe80::aff:fe01:1c5/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1462 Metric:1
RX packets:11 errors:0 dropped:0 overruns:0 frame:0
TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:448 (448.0 B) TX bytes:468 (468.0 B)
The status also looks good to me:
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: vpn_br
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: vpn_br
MII Status: up
Speed: Unknown
Duplex: Unknown
Link Failure Count: 0
Permanent HW addr: 1a:73:32:7f:36:5f
Slave queue ID: 0
...as does the routing table:
# ip route show
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.2
172.16.1.0/24 dev bond0 proto kernel scope link src 172.16.1.2
10.1.1.0/24 dev eth1 proto kernel scope link src 10.1.1.197
default via 10.1.1.11 dev eth1
NB: eth0 is a separate active NIC (Ethernet crossover cable), but that should not matter IMHO.
The problem
The setup looks good to me; however, ping does not work (this was run on pxn1):
# ping 172.16.1.1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
From 172.16.1.2 icmp_seq=2 Destination Host Unreachable
From 172.16.1.2 icmp_seq=3 Destination Host Unreachable
From 172.16.1.2 icmp_seq=4 Destination Host Unreachable
While pinging, tcpdump on the other machine (pxn2) says:
# tcpdump -n -i bond0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 65535 bytes
17:45:13.013791 ARP, Request who-has 172.16.1.1 tell 172.16.1.2, length 28
17:45:13.013835 ARP, Reply 172.16.1.1 is-at 02:00:0a:01:01:c7, length 28
17:45:14.013858 ARP, Request who-has 172.16.1.1 tell 172.16.1.2, length 28
17:45:14.013875 ARP, Reply 172.16.1.1 is-at 02:00:0a:01:01:c7, length 28
17:45:15.013870 ARP, Request who-has 172.16.1.1 tell 172.16.1.2, length 28
17:45:15.013888 ARP, Reply 172.16.1.1 is-at 02:00:0a:01:01:c7, length 28
However, when I also run tcpdump on pxn1 in a separate terminal, I suddenly get my ICMP replies!
...
From 172.16.1.2 icmp_seq=19 Destination Host Unreachable
From 172.16.1.2 icmp_seq=20 Destination Host Unreachable
64 bytes from 172.16.1.1: icmp_req=32 ttl=64 time=0.965 ms
64 bytes from 172.16.1.1: icmp_req=33 ttl=64 time=0.731 ms
64 bytes from 172.16.1.1: icmp_req=34 ttl=64 time=1.00 ms
64 bytes from 172.16.1.1: icmp_req=35 ttl=64 time=0.776 ms
64 bytes from 172.16.1.1: icmp_req=36 ttl=64 time=1.00 ms
This only works as long as both machines have tcpdump running. I can start and stop tcpdump and consistently see replies only while it is running on both machines at the same time. It doesn't matter on which machine I start or stop it.
Is this a kernel bug, or (more probably) is there a problem with my configuration?
Is it normal that the bridge and the bonding interface both show the same MAC address? I only set it manually on the bonding interface, but apparently this also changes the bridge's address...
FYI, config overview:
- for pxn1: http://pastebin.com/2Vw1VAhz
- for pxn2: http://pastebin.com/18RKCb9u
Update
I get a working setup when I put the bridge interface into promiscuous mode (ifconfig vpn_br promisc). I'm not quite sure whether that is normally needed. OTOH I don't think it has any downsides...
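For completeness, the iproute2 equivalent of that ifconfig call is:

ip link set vpn_br promisc on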
BTW, a similar Red Hat bug report exists, but setting bond0 down and up again doesn't help in my case...