2

I have a program that attempts to contact an embedded device over UDP. The embedded device has a link-local address only (169.254..); the Linux host has a normal (DHCP, RFC1918) address, managed by NetworkManager on Ubuntu Natty. This local connection is configured to 'use this connection only for resources on the local network'. My program sends a broadcast packet on one socket, then waits on a unicast socket (bound to the local address, not connect()ed) for an incoming beacon packet.
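A minimal Python sketch of the socket arrangement described above. The helper names `open_discovery_sockets` and `wait_for_beacon`, the port, and the buffer size are hypothetical placeholders, not taken from the actual program.

```python
import socket

def open_discovery_sockets(local_addr, port=0):
    """Return (tx, rx): a broadcast-capable sender and a bound, unconnected receiver."""
    # Sender: SO_BROADCAST must be set before sendto() to a broadcast address.
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)

    # Receiver: bound to the local unicast address but never connect()ed,
    # so the socket itself does not filter on the sender's address.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind((local_addr, port))
    return tx, rx

def wait_for_beacon(rx, timeout=2.0):
    """Wait for one datagram; return (payload, sender) or None on timeout."""
    rx.settimeout(timeout)
    try:
        return rx.recvfrom(2048)
    except socket.timeout:
        return None
```

In the setup described, `local_addr` would be the host's RFC1918 address, and the discovery packet would go out with something like `tx.sendto(payload, ("255.255.255.255", port))`, where both `payload` and `port` are assumptions here.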

At times, I find that the Linux program does not receive packets from the link-local address of the embedded device. Wireshark shows that they are arriving on the incoming interface and are well-formed, but the program never receives them. Packets sent locally both from and to the RFC1918 local address are, however, received, as are packets from other RFC1918 hosts on the same network.

I also find that, upon rebooting, this condition usually spontaneously corrects itself; I can once again receive packets from link-local addresses. Sometimes it also spontaneously corrects itself after just waiting some time.

Is there some obscure route setting or something that could cause the incoming packets to be lost? Outgoing packets work fine (probably because I'm bypassing routing when sending packets).
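One routing-related setting that can silently discard well-formed incoming packets on Linux is reverse-path filtering (`rp_filter`): in strict mode the kernel drops a packet if the route back to its source address does not point out of the interface it arrived on. Whether that is what happens here is an assumption, not something the logs confirm. A small Python sketch for reading the current mode; the function name `read_rp_filter` is hypothetical, and the `base` parameter exists only so the function can be pointed at a different directory:

```python
from pathlib import Path

# Meaning of the rp_filter values documented for the Linux kernel.
RP_MODES = {0: "off", 1: "strict", 2: "loose"}

def read_rp_filter(iface, base="/proc/sys/net/ipv4/conf"):
    """Return (value, name) of the effective rp_filter mode for iface."""
    def read(name):
        return int((Path(base) / name / "rp_filter").read_text().strip())

    # The kernel uses the maximum of the "all" and per-interface values
    # when validating source addresses on that interface.
    value = max(read("all"), read(iface))
    return value, RP_MODES.get(value, "unknown")
```

For example, `read_rp_filter("eth0")` returning `(1, "strict")` would make the routing table relevant to received packets as well as sent ones.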

Correlating the last case of spontaneous restoration, I find this in the logs:

Jul 13 20:58:01 hakase NetworkManager[933]: <info> (eth0): DHCPv4 state changed preinit -> reboot
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Activation (eth0) Stage 4 of 5 (IP4 Configure Get) scheduled...
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Activation (eth0) Stage 4 of 5 (IP4 Configure Get) started...
Jul 13 20:58:01 hakase NetworkManager[933]: <info>   address 192.168.0.148
Jul 13 20:58:01 hakase NetworkManager[933]: <info>   prefix 24 (255.255.255.0)
Jul 13 20:58:01 hakase NetworkManager[933]: <info>   gateway 192.168.0.1
Jul 13 20:58:01 hakase NetworkManager[933]: <info>   nameserver '192.168.0.1'
Jul 13 20:58:01 hakase NetworkManager[933]: <info>   domain name 'mshome.net'
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Scheduling stage 5
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) scheduled...
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Done scheduling stage 5
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Activation (eth0) Stage 4 of 5 (IP4 Configure Get) complete.
Jul 13 20:58:01 hakase NetworkManager[933]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started...
Jul 13 20:58:01 hakase avahi-daemon[862]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.0.148.
Jul 13 20:58:01 hakase avahi-daemon[862]: New relevant interface eth0.IPv4 for mDNS.
Jul 13 20:58:01 hakase avahi-daemon[862]: Registering new address record for 192.168.0.148 on eth0.IPv4.
Jul 13 20:58:02 hakase NetworkManager[933]: <info> Policy set 'Auto dfn3' (wlan0) as default for IPv4 routing and DNS.
Jul 13 20:58:02 hakase NetworkManager[933]: <info> (eth0): device state change: 7 -> 8 (reason 0)
Jul 13 20:58:02 hakase NetworkManager[933]: <info> Activation (eth0) successful, device activated.
Jul 13 20:58:02 hakase NetworkManager[933]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Jul 13 20:58:03 hakase postfix/master[1245]: reload -- version 2.8.2, configuration /etc/postfix
[these next two lines are likely associated with the Wireshark session I have running]
Jul 13 20:58:09 hakase kernel: [37294.962058] device eth0 left promiscuous mode
Jul 13 20:58:10 hakase kernel: [37295.323279] device eth0 entered promiscuous mode
Jul 13 20:58:11 hakase ntpdate[23459]: adjust time server 91.189.94.4 offset -0.024960 sec
Jul 13 21:02:40 hakase dhclient: DHCPREQUEST of 192.168.0.148 on eth0 to 192.168.0.1 port 67
Jul 13 21:02:40 hakase dhclient: DHCPACK of 192.168.0.148 from 192.168.0.1
Jul 13 21:02:40 hakase dhclient: bound to 192.168.0.148 -- renewal in 248 seconds.
Jul 13 21:02:40 hakase NetworkManager[933]: <info> (eth0): DHCPv4 state changed reboot -> renew
Jul 13 21:02:40 hakase NetworkManager[933]: <info>   address 192.168.0.148
Jul 13 21:02:40 hakase NetworkManager[933]: <info>   prefix 24 (255.255.255.0)
Jul 13 21:02:40 hakase NetworkManager[933]: <info>   gateway 192.168.0.1
Jul 13 21:02:40 hakase NetworkManager[933]: <info>   nameserver '192.168.0.1'
Jul 13 21:02:40 hakase NetworkManager[933]: <info>   domain name 'mshome.net'
[at approximately one second later the connection to the link-local device was established]

Could this 'reboot' state be linked with the problem somehow?

bdonlan

4 Answers

2

Statically assign the local address on the Linux host and see if this goes away. Take DHCP out of the picture. At worst you won't get the "spontaneous restoration" effect when it stops working but at least you can cross worries about DHCP off your list.

And, if you want, try assigning a 169.254/16 address in addition and see if that helps.

Mark
  • I'll give it a shot next time it happens; the intermittent nature of the issue makes debugging quite difficult :/ – bdonlan Jul 21 '11 at 22:25
2

There's a lot of convoluted information there, and the only question I see is: "Is there some obscure route setting or something that could cause the incoming packets to be lost?"

What is your real question? I will address the question: "I'm trying to contact 169.254.100.15 from 192.168.1.101. Why can't I contact it?"

Socket communication works over TCP, right?

In order for two hosts on separate subnets to speak to each other, they need to be routed.

Link-local addresses (169.254.0.0/16) do not get routed ever (http://en.wikipedia.org/wiki/Link-local_address).

You can not speak to an address on 169.254.0.0/16 from any other subnet. No way, no how. Not now, not ever.
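The two properties this answer relies on (the peer is link-local, and it sits outside the host's subnet) can be checked with Python's standard `ipaddress` module. A minimal sketch; the helper name `describe_pair` is hypothetical, and the /24 prefix on the host address is an assumption, since the example above gives no netmask:

```python
import ipaddress

def describe_pair(host_if, peer):
    """Classify a peer address relative to a host interface given as 'addr/prefix'."""
    host = ipaddress.ip_interface(host_if)
    peer_addr = ipaddress.ip_address(peer)
    return {
        # True for anything inside 169.254.0.0/16.
        "peer_is_link_local": peer_addr.is_link_local,
        # True only if the peer falls inside the host's own subnet.
        "peer_on_host_subnet": peer_addr in host.network,
    }
```

Calling `describe_pair("192.168.1.101/24", "169.254.100.15")` reports the peer as link-local and not on the host's subnet. This only describes the addressing; it says nothing about whether both hosts share a physical link.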

Additionally: I just thought that you can look into using a loop-back and address packets toward the interface like that.

brandeded
  • Per RFC3927 3.3, link-local addresses should be contactable from non-link-local addresses. Moreover, I'm having problems on receive; the peer's packets are definitely reaching the Linux host, and all sends bypass the routing tables... – bdonlan Jul 21 '11 at 22:25
  • With respect, I don't think RFC3927s3.3 says what you think. It notes that "a host with an IPv4 Link-Local address may send to a destination which does not have an IPv4 Link-Local address", but it says nothing about the other direction; indeed, it adds that "any host conforming to this specification knows that regardless of source address an IPv4 Link-Local destination must be reached by forwarding directly to the destination, not via a router". I'm with mbrownnyc here; if he's described your problem correctly, then 192.168.1.101 should not be able to unicast to 169.254.100.15. – MadHatter Jul 22 '11 at 08:59
  • Reading RFCs is fun; running a traceroute and packet sniffing is better. Your problem is quite confusing, as a network-related problem is highly unlikely to be intermittent. Drawing a diagram of your topology (with gliffy.com, for example) would be extremely useful. I'm still unclear whether the two addresses live on the same host (and can be routed within the stack on the same box), or whether they are on separate hosts and packets must be routed through an external router. – brandeded Jul 22 '11 at 13:52
0

What hits me in your output is the renewal of your IP in 248 seconds.

So do your problems start after these 248 seconds?

If so, the DHCP client might be making some unwanted configuration changes when the renewal hits.

Is there any reason for this very short time frame?

Nils
  • The network in question is a test network; the DHCP server is Windows XP's Internet Connection Sharing (the real internet connection is on another port). No idea why XP chooses that renewal interval... – bdonlan Jul 20 '11 at 20:26
0

Try running avahi-autoipd to have your machine assign itself a 169.254/16 address; you should then be able to talk to other hosts in 169.254/16 on your local network.
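For a rough idea of what IPv4 link-local autoconfiguration involves, the Python sketch below picks a candidate address the way RFC 3927 describes: pseudo-randomly from 169.254.1.0 through 169.254.254.255, with the first and last 256 addresses of the /16 left out as reserved. This is not avahi-autoipd's code; the function name `candidate_link_local` and the use of a MAC string as the seed are illustrative assumptions, and the ARP probing and conflict handling that a real implementation performs are omitted.

```python
import ipaddress
import random

def candidate_link_local(seed):
    """Pick a pseudo-random address in 169.254.1.0 through 169.254.254.255."""
    # Seeding from something stable (here, a MAC address string) makes the
    # same host tend to pick the same candidate each time.
    rng = random.Random(seed)
    third = rng.randint(1, 254)   # skips the reserved 169.254.0.x and 169.254.255.x
    fourth = rng.randint(0, 255)
    return ipaddress.IPv4Address(f"169.254.{third}.{fourth}")
```

A call such as `candidate_link_local("00:11:22:33:44:55")` yields a candidate only; a real daemon would probe the link for conflicts before configuring the address on the interface.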

JasperWallace