
I get this output from traceroute:

#traceroute -i eth1 -s 192.168.12.14 192.168.1.72

1  192.168.12.1 (192.168.12.1)  1.410 ms  2.076 ms  2.251 ms
2  * * *
3  * * *
etc..

But in another terminal I can see the correct replies (Port Unreachable) arriving from the target host:

9.964867 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)     
9.964879 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)
9.964886 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)
9.964904 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)
9.964923 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)
9.964927 192.168.1.72 -> 192.168.12.14 ICMP 102 Destination unreachable (Port unreachable)

At first I thought it was a firewall issue but I checked and no packets are being dropped. The only thing that comes to mind is that this is the second NIC...

If I run traceroute to the same host on the first NIC I get the same wireshark trace as above (obviously with a different source IP) -- but the traceroute command succeeds.

I don't understand how wireshark can see the replies but traceroute fails on the second NIC.

I think I'm missing something pretty basic here....

David Semeria

2 Answers


Wireshark will show what arrives on the network interface. The kernel has obviously seen those packets, but for some reason decided that they are not to be delivered to the traceroute command.

There are a few things that could have gone wrong and caused the kernel to decide not to deliver those packets.

  • You may have asymmetric routing, which is not compatible with reverse path filtering, but have left rp_filter enabled.
  • The kernel may not be able to match the content of the ICMP error message with a local socket. This could happen if the packet was truncated and does not carry enough information to make that decision. It could also happen with a broken NAT configuration where packets in one direction get routed through a NAT but packets in the other direction do not.
  • The kernel may drop the packets due to a bad checksum.

Of those, I think rp_filter is the most likely explanation. You didn't specify an operating system, but it looks like it might be a Linux system, so try this command: head /proc/sys/net/ipv4/conf/*/rp_filter. You would likely see 1 on every one of them, meaning the filter is enabled. Try writing a 0 to the entry for the interface the packets are being dropped on, as well as to the entry named all.
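To see why both entries matter, here is a minimal sketch of how the two settings combine. It relies on the rule from the kernel's ip-sysctl documentation that the larger of conf/all/rp_filter and conf/<interface>/rp_filter is the value used for source validation on that interface. The function name effective_rp_filter and the CONF_ROOT variable are made up for this illustration; they are not part of any tool.

```shell
# Print the rp_filter value that applies to one interface.
# The kernel uses the larger of conf/all/rp_filter and
# conf/<iface>/rp_filter when validating packets on <iface>.
# CONF_ROOT is a hypothetical override, useful only for pointing
# the function at a directory other than the real sysctl tree.
effective_rp_filter() {
    iface=$1
    root=${CONF_ROOT:-/proc/sys/net/ipv4/conf}
    all=$(cat "$root/all/rp_filter") || return 1
    own=$(cat "$root/$iface/rp_filter") || return 1
    if [ "$all" -gt "$own" ]; then
        echo "$all"
    else
        echo "$own"
    fi
}
```

Running effective_rp_filter eth1 with all set to 1 and eth1 set to 0 prints 1, so the filter stays active on eth1 until both are cleared. As root, sysctl -w net.ipv4.conf.all.rp_filter=0 and sysctl -w net.ipv4.conf.eth1.rp_filter=0 should clear them, assuming the interface in question really is eth1.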

kasperd
  • That's a fantastic answer! I tried setting rp_filter to 0 for eth1 but nothing changed. I don't think I've got an asymmetric routing problem because all the hosts on all the subnets have the same L3 switch as their default gateway. In fact, the topology for eth0 is identical to eth1. At the moment the default route is via eth0, but if I change it to eth1 then all the above works but the problem just moves to eth0. I'm convinced it has something to do with eth1 not being the 'default' NIC. BTW, /proc/sys/net/ipv4/ip_forward is intentionally zero. – David Semeria May 15 '14 at 20:28
  • @DavidSemeria I forgot, you have to write `0` not only to the one corresponding to the device dropping the packets but also to the one named `all`. This is because a `1` in `/proc/sys/net/ipv4/conf/all/rp_filter` actually means the filter is enabled on all interfaces. The value of `ip_forward` should not matter in your particular case. – kasperd May 15 '14 at 21:09
  • ABSOLUTELY AMAZING - IT WORKS!! I've been bashing my head on this problem all week and just today spent 12 hours on it. God bless you @kaspered you are brilliant! – David Semeria May 15 '14 at 21:17

Can you post the output of the routing table of each box here? If you have an invalid / missing default route, the packets will not have a return path. Please post the output of:

# ip route list

on each Linux box.

scott--