this one bugs me for a couple of days now and I cannot find a solution. Sorry for the strange title. This isn't my main field of work and I currently cannot think of a better title.
Some facts:
- I'm a user on a shared openstack environment
- I cannot see or change the configuration of the underlaying openstack setup
- The VMs are configured via cloud-init which installs python-minimal, creates a user and does an
apt-get dist-upgrade
. Apart from that they are configured via DHCP with a static IP. - I configured no
iptables
rules on the nodes.
So, let me describe the setup:
I created a network+subnet (10.0.30.10/24). The network is attached to a router which holds two static routes. I also created two VMs (both ubuntu 16.04.2 LTS) which got their "main" IP (node0: 10.0.30.10/24 and node1: 10.0.30.11/24) and also a second IP in a different subnet (node0: 10.10.10.2/24 and node1: 10.10.20.2/24).
I also configured two static routes on the router which forward everything for 10.10.10.0/24
to node0 and everything for 10.10.20.0/24
to node1.
+----------------------------------------+
| test-router |
| IP: 10.0.30.1/24 |
| |
| Static routes: |
| - destination_cidr = "10.10.10.0/24" |
| next_hop = "10.0.30.10" |
| - destination_cidr = "10.10.20.0/24" |
| next_hop = "10.0.30.11" |
+----------------------------------------+
|
|
+------------------------+
| test-network |
| Subnet: 10.0.30.0/24 |
| Router: 10.0.30.1 |
+------------------------+
|
|
| +---------------------+
| | node0 |
+-------+ IP: 10.0.30.10/24 |
| | 10.10.10.2/24 |
| +---------------------+
|
| +---------------------+
| | node1 |
+-------+ IP: 10.0.30.11/24 |
| 10.10.20.2/24 |
+---------------------+
After both VMs are bootet I can observe the following:
Node0
$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 10.0.30.1 0.0.0.0 UG 0 0 0 ens3
10.0.30.0 * 255.255.255.0 U 0 0 0 ens3
169.254.169.254 10.0.30.100 255.255.255.255 UGH 0 0 0 ens3
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:31:67:52 brd ff:ff:ff:ff:ff:ff
inet 10.0.30.10/24 brd 10.0.30.255 scope global ens3
valid_lft forever preferred_lft forever
inet 10.10.10.2/24 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe31:6752/64 scope link
valid_lft forever preferred_lft forever
$ ping -c10 10.10.20.2
PING 10.10.20.2 (10.10.20.2) 56(84) bytes of data.
From 10.0.30.1: icmp_seq=2 Redirect Host(New nexthop: 10.0.30.11)
From 10.0.30.1: icmp_seq=3 Redirect Host(New nexthop: 10.0.30.11)
--- 10.10.20.2 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 8999ms
$ route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 10.0.30.1 0.0.0.0 UG 0 0 0 ens3
10.0.30.0 * 255.255.255.0 U 0 0 0 ens3
10.10.10.0 * 255.255.255.0 U 0 0 0 ens3
169.254.169.254 10.0.30.100 255.255.255.255 UGH 0 0 0 ens3
Meanwhile on node1
# tcpdump icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
09:25:55.451876 IP 10.0.30.10 > 10.10.20.2: ICMP echo request, id 1271, seq 1, length 64
09:25:55.451928 IP 10.10.20.2 > 10.0.30.10: ICMP echo reply, id 1271, seq 1, length 64
09:25:56.451467 IP 10.0.30.10 > 10.10.20.2: ICMP echo request, id 1271, seq 2, length 64
09:25:56.451503 IP 10.10.20.2 > 10.0.30.10: ICMP echo reply, id 1271, seq 2, length 64
09:25:57.451185 IP 10.0.30.10 > 10.10.20.2: ICMP echo request, id 1271, seq 3, length 64
09:25:57.451218 IP 10.10.20.2 > 10.0.30.10: ICMP echo reply, id 1271, seq 3, length 64
[..]
09:26:03.450910 IP 10.0.30.10 > 10.10.20.2: ICMP echo request, id 1271, seq 9, length 64
09:26:03.450943 IP 10.10.20.2 > 10.0.30.10: ICMP echo reply, id 1271, seq 9, length 64
09:26:04.450988 IP 10.0.30.10 > 10.10.20.2: ICMP echo request, id 1271, seq 10, length 64
09:26:04.451022 IP 10.10.20.2 > 10.0.30.10: ICMP echo reply, id 1271, seq 10, length 64
So, my conclusion is: node1 receives the traffic but the reply doesn't make its way to node0.
The same happens if I start a webserver on node1 and try to curl it via the statically routed IP. I see traffic coming in on node1 but the response never makes it to node0.
On the other hand: arping
from node0 to node1 works:
# arping -c3 -i ens3 10.10.20.2
ARPING 10.10.20.2
42 bytes from fa:16:3e:a9:b4:bc (10.10.20.2): index=0 time=7.933 msec
42 bytes from fa:16:3e:a9:b4:bc (10.10.20.2): index=1 time=2.797 msec
42 bytes from fa:16:3e:a9:b4:bc (10.10.20.2): index=2 time=9.703 msec
--- 10.10.20.2 statistics ---
3 packets transmitted, 3 packets received, 0% unanswered (0 extra)
rtt min/avg/max/std-dev = 2.797/6.811/9.703/2.929 ms
If I use the "main" IP, everything works fine.
Things I tried (on both nodes):
- Setting
/proc/sys/net/ipv4/conf/ens3/rp_filter
to2
and0
(because I'm desperate). Nothing changed. - Setting
/proc/sys/net/ipv4/ip_forward
to1
. Nothing changed. - Setting
/proc/sys/net/ipv4/conf/ens3/log_martians
to1
on both nodes. No output viajournalctl -f
whatsoever.
EDIT: There is output on node0 if I ping node1 via the static IP:
May 03 11:16:01 node0 kernel: IPv4: Redirect from 10.0.30.1 on ens3 about 10.0.30.11 ignored
Advised path = 10.0.30.10 -> 10.10.20.2
And since I'm running out of ideas, I need your help. Thanks for taking the time looking into my problem!