6

This one's been bugging me for years.

Basic question: Is there some reason ARP has to be implemented with fixed timeouts on ARP cache entries?

I do a lot of work in Real Time circles. We do most of our inter-system communications these days on dedicated UDP/IP links. For the most part this works reliably in Real Time, but for one nit: ARP entry timeouts.

The way typical implementations do ARP is the following:

  • When a client asks to send an IP packet to an IP address with an unknown MAC address, the stack sends out an ARP request instead of that IP packet. If an upper layer (TCP) does resends, that's no problem. But since we use UDP, the original message is lost. At startup time this is OK, but in the middle of operation this is a Bad Thing™.
  • (Dynamic) ARP table entries are removed from the ARP table periodically, even if we just got a packet from that system a millisecond ago. This means the Bad Thing™ happens to our system regularly.
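
The two behaviors above can be modeled in a few lines. This is a toy sketch, not any real stack's code: `FixedTimeoutArpCache`, `send_udp`, the `wire` list standing in for the network, and the 60-second lifetime are all names and values made up for illustration.

```python
class FixedTimeoutArpCache:
    """Toy model: an entry expires a fixed time after it was learned."""

    def __init__(self, lifetime_s=60.0):  # assumed value, not from any real stack
        self.lifetime_s = lifetime_s
        self.entries = {}  # ip -> (mac, time_learned)

    def learn(self, ip, mac, now):
        self.entries[ip] = (mac, now)

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned = entry
        if now - learned >= self.lifetime_s:
            # Expired, no matter how recently the peer was heard from.
            del self.entries[ip]
            return None
        return mac


def send_udp(cache, ip, payload, now, wire):
    """Send one datagram; on a cache miss, emit an ARP request and drop the payload."""
    mac = cache.lookup(ip, now)
    if mac is None:
        wire.append(("ARP-REQUEST", ip))
        return False  # the datagram is gone; UDP will not resend it
    wire.append((mac, payload))
    return True
```

In this model a datagram sent just after the lifetime elapses is replaced on the wire by an ARP request, which is the mid-operation loss described above.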

The obvious solution (which we use religiously) is to make all the ARP entries static. However, that's a royal PITA (particularly on RTOSes, where finding an interface's MAC address is not always a matter of a couple of easy GUI clicks).

Back when we wrote our own IP stack, I solved this problem by never (ever) timing out ARP table entries. That has obvious drawbacks. A more robust and perfectly reasonable solution might be to refresh the entry timeout whenever a packet from the same MAC/IP combo is seen. That way an entry would only get timed out if that host hadn't communicated with the stack in that amount of time.
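
A minimal sketch of that refresh-on-receive idea, assuming a simple in-memory table. `RefreshingArpCache`, its method names, and the idle timeout value are hypothetical and not taken from any vendor's stack.

```python
class RefreshingArpCache:
    """Toy model: an entry expires only after the peer has been silent too long."""

    def __init__(self, idle_timeout_s=60.0):  # assumed value
        self.idle_timeout_s = idle_timeout_s
        self.entries = {}  # ip -> [mac, last_heard]

    def learn(self, ip, mac, now):
        self.entries[ip] = [mac, now]

    def on_packet_received(self, src_ip, src_mac, now):
        """Call for every received frame; restarts the idle timer on a match."""
        entry = self.entries.get(src_ip)
        if entry is not None and entry[0] == src_mac:
            entry[1] = now  # same MAC/IP combo seen again
        # A different MAC for a known IP does not refresh the entry,
        # so a stale mapping still ages out.

    def lookup(self, ip, now):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, last_heard = entry
        if now - last_heard >= self.idle_timeout_s:
            del self.entries[ip]
            return None
        return mac
```

The design choice here is that only traffic matching the cached MAC/IP pair keeps the entry alive, so a host that changes its address still gets flushed after one idle period.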

But now we're using our vendor's IP stack, and we're back to the stupid ARP timeouts. We have enough leverage with this vendor that I could perhaps get them to use a less inconvenient scheme. However, the universality of this brain-dead timeout algorithm leads me to believe it might be a required part of the implementation.

So that's the question. Is this behavior somehow required?

T.E.D.
  • 2
    I'd say the behavior of dropping the packet and doing an ARP procedure instead is quite bad. E.g. Windows buffers only 1 packet during ARP, while many other OSs do the more sane thing and buffer packets in the normal socket buffer during ARP. – nos Mar 15 '13 at 22:45
  • @nos - Outgoing or incoming? As near as I can tell, outgoing *TCP* packets are buffered (because that's how TCP works, to ensure reliability). Outgoing UDP packets are just dropped. – T.E.D. Mar 19 '13 at 14:51
  • 1
    Any outgoing IP packet (for the destination), be it TCP or UDP. Of course, TCP will detect and retransmit the dropped packets. – nos Mar 19 '13 at 21:31
  • In that case, agree heartily. Perhaps it isn't a big problem for your typical PC application, but in my Real Time world, it's deadly. This has been the bane of my existence for 7 years or so now. I just had *another* possible weird ARP issue crop up this morning. :-( – T.E.D. Mar 19 '13 at 21:48

2 Answers

4

RFC1122 Requirements for Internet Hosts discusses this.

     2.3.2.1  ARP Cache Validation

        An implementation of the Address Resolution Protocol (ARP)
        [LINK:2] MUST provide a mechanism to flush out-of-date cache
        entries.  If this mechanism involves a timeout, it SHOULD be
        possible to configure the timeout value.

      ...

       DISCUSSION:
             The ARP specification [LINK:2] suggests but does not
             require a timeout mechanism to invalidate cache entries
             when hosts change their Ethernet addresses.  The
             prevalence of proxy ARP (see Section 2.4 of [INTRO:2])
             has significantly increased the likelihood that cache
             entries in hosts will become invalid, and therefore
             some ARP-cache invalidation mechanism is now required
             for hosts.  Even in the absence of proxy ARP, a long-
             period cache timeout is useful in order to
             automatically correct any bad ARP data that might have
             been cached.

Networks can be very dynamic: a DHCP server can assign the same IP address to a different computer once an old lease expires (making current ARP data invalid), and an IP conflict may never be noticed unless ARP requests are made periodically.

It also provides a mechanism for checking if a host is still on the network. Imagine you're streaming a video over UDP to some IP address 192.168.0.5. If you cache the MAC address of that machine forever, you'll just keep spamming out UDP packets even if the host goes down. Doing an ARP request every now and then will stop the stream with a destination unreachable error because no one responded with a MAC for that IP.

PherricOxide
  • 1
    After following [LINK:2](http://tools.ietf.org/html/rfc826.html) from your answer, I found a suggestion for the exact algorithm I'm talking about (which as far as I know, no stack implements). So I guess that is my answer. – T.E.D. Mar 19 '13 at 21:30
  • From [RFC 826](http://tools.ietf.org/html/rfc826.html) "Or perhaps receiving of a packet from a host should reset a timeout in the address resolution entry used for transmitting packets to that host; if no packets are received from a host for a suitable length of time, the address resolution entry is forgotten." – T.E.D. Mar 19 '13 at 21:32
  • 1
    @T.E.D. Yes you can do that. Many operating systems do this, and have a positive feedback from incoming packets into the ARP cache. – nos Mar 19 '13 at 21:33
  • OK. This is what I needed. I wish I could accept both answers. This one better answers what I was asking, but Ross' is more likely to be helpful to me (I perhaps have enough pull with the vendor to get them to change their algorithm, but that will take a long time, and will only solve the issue on one side). – T.E.D. Mar 19 '13 at 21:41
2

It originated in distrust of routing protocols, especially in the non-Ethernet world (in particular MIT's CHAOS networks). Chris Moon, one of the early "ARPAnauts", was quoted specifically about this in the original ARP RFC.

You can, of course, keep the other guys' ARP caches from timing out by proactively broadcasting your own ARP announcements. Most Ethernet layers will accept gratuitous ARP responses into their caches without trying to correlate them to ARP requests they have previously sent.
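
As a rough sketch of what such an announcement looks like on the wire, the function below builds the bytes of a broadcast Ethernet frame carrying a gratuitous ARP reply, with the sender's MAC/IP duplicated in the target fields. `build_gratuitous_arp_reply` is a hypothetical helper name, and the example addresses are made up.

```python
import socket
import struct


def build_gratuitous_arp_reply(mac: bytes, ip: str) -> bytes:
    """Return a 42-byte Ethernet frame announcing that `ip` is at `mac`."""
    if len(mac) != 6:
        raise ValueError("MAC address must be exactly 6 bytes")
    ip_bytes = socket.inet_aton(ip)  # 4-byte IPv4 address

    # Ethernet header: broadcast destination, our source, EtherType 0x0806 (ARP).
    eth = b"\xff" * 6 + mac + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 2 (reply).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)

    # Sender and target fields both carry our own MAC/IP pair.
    arp += mac + ip_bytes + mac + ip_bytes
    return eth + arp


frame = build_gratuitous_arp_reply(bytes.fromhex("020000000001"), "192.168.0.5")
print(len(frame), frame.hex())
```

Actually transmitting the frame needs a raw link-layer socket and usually elevated privileges. How that is done is OS-specific, and an RTOS stack may not expose one at all.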

Ross Patterson
  • Interesting. I hadn't thought of that solution. Of course some OSes have an annoying tendency to make it tough to get MAC addresses programmatically. I'll have to look to see how ours handles it. – T.E.D. Mar 17 '13 at 03:54
  • OOC, can you typically do this to **yourself**? By this, I mean broadcast an ARP response (perhaps to the loopback) with someone else's MAC/IP combo to keep that entry from timing out of your own table. I ask because keeping our UDP link deterministic relies on one machine being the acknowledged master on the link. The other machine can't talk unless the master requests, so that other system isn't really allowed to send his own ARPs on his own schedule. – T.E.D. Mar 19 '13 at 14:44
  • 1
    If your OS and network drivers don't prevent you, yes, you can: [Wikipedia: ARP announcements](http://en.wikipedia.org/wiki/Address_Resolution_Protocol#ARP_announcements) – Ross Patterson Mar 19 '13 at 18:14
  • Heh. Already had that exact link loaded in another browser tab, but thanks. :-) – T.E.D. Mar 19 '13 at 19:22