I have an application that runs a raw IP socket. The destination of this socket is governed by routes installed via the 'ip route add' command. These routes can change during the lifetime of the socket (e.g. because the next-hop changes).
Simplified, let's say I have 2 interfaces, eth0 and eth1. I also have a default route via eth0.
The raw socket's endpoint is, for example, 10.10.10.10, and eth1 has address 100.0.0.1. I do the following during the raw socket's lifetime:
ip -f inet route delete 10.10.10.10
ip -f inet route add 100.0.0.2 dev eth1
ip -f inet route add 10.10.10.10/32 via 100.0.0.2 dev eth1
Now what I see is that after this operation traffic goes correctly via eth1 for a few seconds, then it goes wrong (via eth0) for a short while (less than half a second), and then it's correct again (as far as I can see, permanently).
So my main question is:
- Can anybody explain what might be going wrong here? I tried adding ip route flush cache after the sequence above, but that made no difference. I'm currently puzzled as to why traffic sometimes gets dropped. I think it's either a timing issue in the routing commands or some other trigger disabling the route for a split second, but I'm running out of options.
I did try the SO_BINDTODEVICE option on my raw socket, but alas this didn't help much. The main difference is that when traffic goes wrong it now isn't sent out at all, because it would otherwise leave via the wrong interface. What I had hoped for was that the send would fail with errno set to something like E_CANNOTROUTE (which doesn't exist), so I could catch the failure and retry sending the packet. It currently does not do this. Is there a way I could catch such a failure? I have (almost) full control over the system and the application that runs the socket.
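To make the catch-and-retry idea concrete, here is a minimal sketch in Python (my application's language doesn't matter for the pattern). It assumes the kernel would report the routing failure as ENETUNREACH or EHOSTUNREACH, which is exactly what I am not seeing at the moment, so treat it as the behavior I was hoping for rather than something that works today. The name send_with_retry and its parameters are made up for illustration.

```python
import errno
import time
from typing import Callable

# Assumption: a missing route would show up as one of these real errno values.
# In the situation described above the send does NOT fail, so this is hypothetical.
ROUTING_ERRNOS = {errno.ENETUNREACH, errno.EHOSTUNREACH}


def send_with_retry(send: Callable[[bytes], int], packet: bytes,
                    attempts: int = 3, delay: float = 0.05) -> int:
    """Call send(packet); retry only when the error looks like a routing failure."""
    for attempt in range(attempts):
        try:
            return send(packet)
        except OSError as exc:
            last_attempt = attempt == attempts - 1
            # Anything that is not a routing error, or the final failure, is re-raised.
            if exc.errno not in ROUTING_ERRNOS or last_attempt:
                raise
            time.sleep(delay)  # give the route a moment to come back
    raise ValueError("attempts must be at least 1")
```

With a real raw socket this would be used roughly as send_with_retry(lambda p: sock.sendto(p, ("10.10.10.10", 0)), packet).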
One solution I know would work is to use AF_PACKET sockets instead of L3 raw sockets (and do ARP/ND myself), but I'd rather not go down that road just yet.
Update
I have improved the behavior of my system by changing how route updates are applied. When I have to update the next-hop, I now look at the already installed route and act based on that:
- If no route is installed, I just install the new route and skip the delete.
- If the exact route is already present (same next-hop, same dev), I do nothing.
- If the route is present with a different next-hop, I do a more specific delete for just that next-hop, followed by an add.
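The three cases above can be sketched roughly like this. It is a simplified illustration in Python, not my actual code: the Route tuple and plan_route_update are hypothetical names, and the function only builds the ip command strings instead of running them.

```python
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    prefix: str   # e.g. "10.10.10.10/32"
    nexthop: str  # e.g. "100.0.0.2"
    dev: str      # e.g. "eth1"


def plan_route_update(installed: Optional[Route], wanted: Route) -> List[str]:
    """Return the ip commands needed to go from `installed` to `wanted`.

    Assumes `installed`, if present, is the current route for the same prefix.
    """
    add = (f"ip -f inet route add {wanted.prefix} "
           f"via {wanted.nexthop} dev {wanted.dev}")
    if installed is None:
        return [add]          # case 1: nothing installed, add only
    if installed == wanted:
        return []             # case 2: exact route already present, do nothing
    # case 3: different next-hop, delete exactly that route, then add the new one
    delete = (f"ip -f inet route delete {installed.prefix} "
              f"via {installed.nexthop} dev {installed.dev}")
    return [delete, add]
```

Only the last case still issues a delete followed by an add, which is where the short window without the specific route can occur.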
While this resolved most of my issues, I still sometimes see the same thing happen (though much less often) when an actual delete+add takes place (the last case in the new mechanism). Also, this still does not explain what goes wrong (it merely works around it), so I'll leave this question open for now, as I'm really curious what is going on here.
FYI: I see the issue on CentOS, on every version I have checked from CentOS 4 through CentOS 6, 32-bit.