
I'm building an embedded system for a camera controller on Linux (not real-time). I'm having trouble getting the networking to do what I want. The system has three NICs: one 100Base-T port and two gigabit ports. The slower one is hooked up to the camera (that's all it supports), and the faster ones are point-to-point connections to other machines. What I'm attempting to do is get an image from the camera, do a little processing, then broadcast it over UDP out each of the other NICs.

Here is my network configuration:

eth0: addr: 192.168.1.200  Bcast: 192.168.1.255  Mask: 255.255.255.0  (this is the 100Base-T)
eth1: addr: 192.168.2.100  Bcast: 192.168.2.255  Mask: 255.255.255.0
eth2: addr: 192.168.3.100  Bcast: 192.168.3.255  Mask: 255.255.255.0

The image is coming in off eth0 in a proprietary protocol, so it's a raw socket. I can broadcast it to eth1 or eth2 just fine. But when I try to broadcast it to both, one after the other, I get lots of network hiccups and errors on eth0.
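(For context, the receive side is just a packet socket bound to eth0 — roughly the sketch below, with the proprietary parsing left out; names are illustrative.)

#include <arpa/inet.h>       // htons
#include <linux/if_ether.h>  // ETH_P_ALL
#include <linux/if_packet.h> // struct sockaddr_ll
#include <net/if.h>          // if_nametoindex
#include <string.h>
#include <sys/socket.h>

int raw = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

struct sockaddr_ll sll;
memset(&sll, 0, sizeof(sll));
sll.sll_family   = AF_PACKET;
sll.sll_protocol = htons(ETH_P_ALL);
sll.sll_ifindex  = if_nametoindex("eth0");   // only listen on the camera port
bind(raw, (struct sockaddr *)&sll, sizeof(sll));

// recvfrom(raw, frame, sizeof(frame), 0, NULL, NULL) and then parse the proprietary frames...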

I initialize the UDP sockets like this:

sock2 = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);   // or sock3
sa.sin_family = AF_INET;
sa.sin_port   = htons(8000);
inet_aton("192.168.2.255", &sa.sin_addr);           // or 192.168.3.255
int broadcast = 1;
setsockopt(sock2, SOL_SOCKET, SO_BROADCAST, &broadcast, sizeof(broadcast));
bind(sock2, (struct sockaddr *)&sa, sizeof(sa));

sendto(sock2, &data, sizeof(data), 0, (struct sockaddr *)&sa, sizeof(sa));   // sizeof(data) < 1100 bytes

I do this for each socket separately, and call sendto separately. When I do one or the other, it's fine. When I try to send on both, eth0 starts getting bad packets.

Any ideas on why this is happening? Is it a configuration error, or is there a better way to do this?

EDIT: Thanks for all the help. I've been trying some things and looking into this more. The issue does not appear to be broadcasting, strictly speaking: I replaced the broadcast sends with unicast sends and got the same behavior. I think I understand the behavior better now, but not how to fix it.

Here is what is happening. On eth0 I am supposed to get an image every 50ms. When I send out an image on eth1 (or 2) it takes about 1.5ms to send the image. When I try to send on both eth1 and eth2 at the same time it takes about 45ms, occasionally jumping to 90ms. When this goes beyond the 50ms window, eth0's buffer starts to build. I lose packets when the buffer gets full, of course.

So my revised question. Why would it go from 1.5ms to 45ms just by going from one ethernet port to two?

Here is my initialization code:

sock[i] = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
sa[i].sin_family = AF_INET;
sa[i].sin_port   = htons(8000);
inet_aton(ip, &sa[i].sin_addr);

// If broadcasting:
char buffer[] = "eth1";                                            // or "eth2"
setsockopt(sock[i], SOL_SOCKET, SO_BINDTODEVICE, buffer, sizeof(buffer));
int b = 1;
setsockopt(sock[i], SOL_SOCKET, SO_BROADCAST, &b, sizeof(b));

Here is my sending code:

for (i = 0; i < 65; i++) {   // ~65 chunks per image, each under 1100 bytes
  sendto(sock[0], &data[i], sizeof(data[i]), 0, (struct sockaddr *)&sa[0], sizeof(sa[0]));
  sendto(sock[1], &data[i], sizeof(data[i]), 0, (struct sockaddr *)&sa[1], sizeof(sa[1]));
}

It's pretty basic.
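For reference, the 1.5ms / 45ms figures can be reproduced by timing that loop with something along these lines (clock_gettime() with CLOCK_MONOTONIC; simplified, reusing i, sock, sa, and data from above):

#include <stdio.h>
#include <time.h>

struct timespec t0, t1;
clock_gettime(CLOCK_MONOTONIC, &t0);
for (i = 0; i < 65; i++) {
  sendto(sock[0], &data[i], sizeof(data[i]), 0, (struct sockaddr *)&sa[0], sizeof(sa[0]));
  sendto(sock[1], &data[i], sizeof(data[i]), 0, (struct sockaddr *)&sa[1], sizeof(sa[1]));
}
clock_gettime(CLOCK_MONOTONIC, &t1);
printf("image took %.2f ms to send\n",
       (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6);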

Any ideas? Thanks for all your great help!

Paul

  • Is your second `sendto` using the same unmodified `sa`? – Hasturkun Nov 13 '11 at 14:20
  • No, each socket has its own sa. – user1044200 Nov 13 '11 at 14:36
  • I mean, is the `sa` set in lines 2-3 used as is in the `sendto`, or is it re-set? – Hasturkun Nov 13 '11 at 14:56
  • Yes, the sa set in lines 2-3 is used as-is in the sendto for sock2. For sock3, there is another sa differing only by the sa.sin_addr that is used for the sendto for sock3. – user1044200 Nov 13 '11 at 15:09
  • Side question: why use broadcast rather than multicast? – John Zwinck Nov 13 '11 at 17:31
  • The reason for broadcast rather than multicast was to allow any device to attach to the system and receive messages. That being said, I tried to re-implement this using two multicast networks and it didn't change anything. – user1044200 Nov 13 '11 at 20:01
  • Are these NICs on separate wires? If they are on the same wire, the masks are set such that broadcasts will confuse the other NICs. Increase the mask enough to expose the differentiated IP addresses. – wallyk Nov 13 '11 at 20:37
  • increase the receiving buffer, that'll solve the packet drop problems... – Karoly Horvath Nov 14 '11 at 22:07
  • It looks like you're trying to bind the socket to a broadcast address, which makes no sense -- you almost certainly want to bind it to `INADDR_ANY` so it can send/receive data on any interface. Then use multiple sendto calls to send to each broadcast address (see the sketch after these comments). – Chris Dodd Nov 14 '11 at 22:45
  • As this is Linux, maybe you could try using the `tee(2)`/`splice(2)` syscalls? – fge Dec 15 '11 at 20:05
  • Do eth0, eth1, or eth2 share a chip? That is, are they separate hardware paths that can truly run in parallel, or do at least two of them share hardware? Could that sharing cause the time delay that you see? – mpez0 Dec 27 '11 at 18:28
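A minimal sketch of what Chris Dodd suggests, using the port and broadcast addresses from the question (data stands for one image chunk, as above; error handling omitted):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

int s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
int on = 1;
setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));

// Bind to INADDR_ANY (not to a broadcast address) so the one socket works on any interface.
struct sockaddr_in any;
memset(&any, 0, sizeof(any));
any.sin_family      = AF_INET;
any.sin_addr.s_addr = htonl(INADDR_ANY);
any.sin_port        = htons(8000);
bind(s, (struct sockaddr *)&any, sizeof(any));

// One sendto() per broadcast address; the routing table picks eth1/eth2 for each destination.
struct sockaddr_in dst[2];
memset(dst, 0, sizeof(dst));
dst[0].sin_family = dst[1].sin_family = AF_INET;
dst[0].sin_port   = dst[1].sin_port   = htons(8000);
inet_aton("192.168.2.255", &dst[0].sin_addr);
inet_aton("192.168.3.255", &dst[1].sin_addr);

sendto(s, &data, sizeof(data), 0, (struct sockaddr *)&dst[0], sizeof(dst[0]));
sendto(s, &data, sizeof(data), 0, (struct sockaddr *)&dst[1], sizeof(dst[1]));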

3 Answers


It's been a long time, but I found the answer to my question, so I thought I would put it here in case anyone else ever finds it.

The two gigabit Ethernet ports were actually behind a PCI bridge hanging off the PCI Express bus. The PCI Express link was internal to the motherboard, but it was a plain PCI bus going out to the cards. The bridge and the bus did not have enough bandwidth to actually send the images out that fast. With only one NIC enabled, the data was handed to the buffer and looked very quick from my side, but it took much longer to actually get through the bus, out the card, and onto the wire. The second NIC was slower because the buffer was already full. Changing the buffer size masked the problem, but it did not actually send the data out any faster, and I was still getting dropped packets on the third NIC.

In the end, the 100Base-T port was actually built onto the motherboard and therefore had a faster bus behind it, giving it more usable bandwidth overall than the gigabit ports. By switching the camera to a gigabit port and one of the point-to-point links to the 100Base-T port, I was able to meet the requirements.

Strange.

– user1044200

Maybe your UDP stack runs out of memory?

(1) Check /proc/sys/net/ipv4/udp_mem (see man 7 udp for details). Make sure the first number is at least 8 times the image size. This sets the memory available to all UDP sockets in the system.

(2) Make sure the per-socket send buffer is big enough. Use setsockopt() with SO_SNDBUF to set the send buffer to around image_size*2 on both sending sockets. You might need to increase the maximum allowed value in /proc/sys/net/core/wmem_max first. See man 7 socket for details.

(3) You might as well increase the RX buffer on the receiving socket too: write a big number to /proc/sys/net/core/rmem_max, then use SO_RCVBUF to increase the receive buffer size. Both (2) and (3) are sketched below.
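A rough sketch of (2) and (3), assuming an image of roughly 64 KB; sock[0]/sock[1] are the sending sockets from the question and camera_sock stands in for the raw socket on eth0 (names and sizes are illustrative):

#include <sys/socket.h>

int sndbuf = 2 * 65536;   // ~2x the image size; the kernel doubles this and caps it at wmem_max
setsockopt(sock[0], SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));
setsockopt(sock[1], SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));

int rcvbuf = 4 * 65536;   // room for a few incoming images; capped at rmem_max
setsockopt(camera_sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));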

– theamk

A workaround until the underlying issue is actually solved may be to create a bridge for eth1+eth2 and send the packets to that bridge. That way each image is only mapped into kernel memory once instead of twice.
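If brctl isn't on the embedded image, the bridge can be created directly through the bridge ioctls; a rough sketch (the name br0 is arbitrary, the bridge still needs an address and to be brought up afterwards, and this requires root):

#include <linux/sockios.h>   // SIOCBRADDBR, SIOCBRADDIF
#include <net/if.h>          // struct ifreq, if_nametoindex, IFNAMSIZ
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

int ctl = socket(AF_LOCAL, SOCK_STREAM, 0);  // any socket works as an ioctl handle
ioctl(ctl, SIOCBRADDBR, "br0");              // equivalent of `brctl addbr br0`

struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
strncpy(ifr.ifr_name, "br0", IFNAMSIZ);
ifr.ifr_ifindex = if_nametoindex("eth1");
ioctl(ctl, SIOCBRADDIF, &ifr);               // equivalent of `brctl addif br0 eth1`
ifr.ifr_ifindex = if_nametoindex("eth2");
ioctl(ctl, SIOCBRADDIF, &ifr);               // equivalent of `brctl addif br0 eth2`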

– Marcus Wolschon