Receiving response(s) from N number of clients in reply to a broadcast request over UDP

Question

I am implementing a kind of IP finder for a particular type of network multimedia device. I want to find out all the alive devices of that type in the LAN, with their IP address and other details.

The device has its own way of device discovery.

It works as follows: A client sends a broadcast request over the LAN via UDP.
The destination port number is fixed.
In reply, all the servers in the LAN that understand the format of this request will respond to this request providing information about themselves.

I am broadcasting the UDP request message using sendto().

Now my problem is that I don't know how many devices (i.e. servers) will respond to the request.

How many times will I have to call recvfrom()?
When will I come to know that I have handled the response from all the devices?
Or in general, is recvfrom() the right choice for receiving response from multiple servers?
Is there any better (or CORRECT if I am wrong here) way of accomplishing the same?

I am programming in C/C++, planning to code for both Windows and Linux.
Many thanks in advance.

Edit: So with the help of all the network programming wizards out here, I have found the solution to my problem :)
select() is just the thing for me...
Thanks a lot to all of you who took out time to help me

I take it that after the discovery phase you want to communicate with these devices? Why not use IP multicast from the beginning (see http://en.wikipedia.org/wiki/Multicast)? — dma_k, Mar 02 '10 at 16:21
The server is implemented in firmware. I have no option but to follow the protocol laid down by the device manufacturer. I have to broadcast the request over UDP :( — puffadder, Mar 02 '10 at 16:23
This really makes me think about `UPNP` or `Universal Plug And Play`, http://fr.wikipedia.org/wiki/Universal_Plug_and_Play — Matthieu M., Mar 03 '10 at 07:21

jschmier · Accepted Answer · 2010-03-04T16:36:33.740

How many times will I have to call recvfrom()? When will I come to know that I have handled the response from all the devices/servers?

If you don't know the number of devices/servers, you cannot know how many times you will need to call recvfrom() or when you've handled all the responses.

You might consider using a select() loop (until timeout) and calling recvfrom() whenever data is available to read. This might be in the main thread or a separate thread.

If the data arrives faster than it can be processed, the socket's receive buffer can fill up and you will lose datagrams. Whether this happens depends largely on how quickly the data is parsed and stored after it is received. If processing the data is an intensive operation, it may be necessary to do the processing in a separate thread, or to store the data until the receive loop times out and only then process it.

Since UDP is unreliable, looping to rebroadcast the request a few times should help account for some of the loss, and the processing should then account for duplicate responses.

The following pseudocode is how I might approach the problem:


/* get socket to receive responses */
sd = socket( ... );

do
{
    /* broadcast request */
    sendto( ... );

    for (;;)
    {
        /* select() overwrites readfds and may modify timeout,
           so set both up again before every call */
        FD_ZERO(&readfds);
        FD_SET(sd, &readfds);
        timeout.tv_sec = 5;
        timeout.tv_usec = 0;

        /* wait for a response (or timeout) */
        if (select(sd + 1, &readfds, NULL, NULL, &timeout) <= 0)
            break;

        /* receive the response */
        recvfrom( ... );

        /* process the response (or queue for another thread / later processing) */
        ...
    }

    /* process any response queued for later (and not another thread) */

} while (necessary);
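
Because the request is rebroadcast, the same device may answer more than once. Below is a minimal sketch of one way to filter duplicates, assuming each device can be identified by its IPv4 source address. The names seen_set, seen_add and MAX_DEVICES are made up for this illustration and are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVICES 64   /* assumed upper bound, for illustration only */

struct seen_set {
    uint32_t addr[MAX_DEVICES];  /* IPv4 addresses as taken from sin_addr.s_addr */
    size_t   count;
};

/* Returns 1 if addr was newly recorded, 0 if it is a duplicate,
 * and -1 if the table is full. */
static int seen_add(struct seen_set *s, uint32_t addr)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->addr[i] == addr)
            return 0;            /* already seen: ignore this response */
    }
    if (s->count == MAX_DEVICES)
        return -1;
    s->addr[s->count++] = addr;
    return 1;
}
```

In the receive loop, a response would only be parsed and stored when seen_add() returns 1. If the protocol's device ID is a better key than the IP address, the same idea applies with a different key type.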

Or in general, is recvfrom() the right choice for receiving response from multiple servers?

recvfrom() is commonly used with connectionless-mode sockets because it permits the application to retrieve the source address of received data.
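
As a rough illustration of retrieving the source address, the sketch below wraps recvfrom() and converts the sender's IPv4 address to text with inet_ntop(). It assumes POSIX headers and an IPv4 socket; recv_with_sender is a hypothetical helper name. On Windows the same calls come from winsock2.h and ws2tcpip.h, with minor type differences.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receives one datagram on sd and writes the sender's dotted-quad IPv4
 * address into ip (which should hold at least INET_ADDRSTRLEN bytes).
 * Returns the number of payload bytes received, or -1 on error. */
static ssize_t recv_with_sender(int sd, void *buf, size_t buflen,
                                char *ip, socklen_t iplen)
{
    struct sockaddr_in from;
    socklen_t fromlen = sizeof from;   /* in: buffer size, out: address size */

    assert(buf != NULL && ip != NULL);
    memset(&from, 0, sizeof from);

    ssize_t n = recvfrom(sd, buf, buflen, 0,
                         (struct sockaddr *)&from, &fromlen);
    if (n < 0)
        return -1;
    if (inet_ntop(AF_INET, &from.sin_addr, ip, iplen) == NULL)
        return -1;
    return n;
}
```

Each call handles exactly one response, so the IP address of every responder is available even though all replies arrive on the same socket.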

If all the servers would respond at the same time, how would that be handled? — puffadder, Mar 02 '10 at 16:17
Thanks for the response. But do I really need a thread to process each response? Is there any other way out? — puffadder, Mar 02 '10 at 16:25
The responses contain IP address and device ID of the responders, among other things. I want to parse the responses and store the required data. — puffadder, Mar 02 '10 at 16:30
If all the servers respond at the same time, their packets will be queued inside some of your switches and sent out serially. Those switches might have a bit too much to handle and could simply drop the packets; UDP is unreliable after all, even on a LAN. Should you be on an ancient LAN, or a LAN connected by hubs instead of switches, Ethernet collision detection would prevent several nodes from sending at the exact same time. — nos, Mar 02 '10 at 17:17

score 2 · Answer 2 · answered Mar 02 '10 at 16:49

Use select(2)/poll(2) with a timeout in a loop, decrementing the timeout every time you get a response from a device. You'd have to come up with an appropriate timeout yourself.

Alternatively, if you are able to recognize/parse the discovery response message, just add the device to the list upon receiving such a message.

You will probably have to deal with timeouts anyway for when devices register but fail at a later point.

score 2 · Answer 3 · edited Mar 02 '10 at 23:50

If you don't know how many servers are going to respond, then you don't know how many times you have to call recvfrom(). I would probably handle this with a select() loop with a suitable timeout, something like the following, which is completely untested and probably full of stupid bugs:

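/* create and bind socket */

fd_set fds;
struct timeval tv;
int ret;

for(;;) {
    /* select() overwrites fds and may modify tv, so reinitialise both on every pass */
    FD_ZERO(&fds);
    FD_SET(sock, &fds);
    tv.tv_sec = 2;
    tv.tv_usec = 0;

    if((ret = select(sock + 1, &fds, NULL, NULL, &tv)) <= 0)
        break;

    char buf[BUFLEN];
    struct sockaddr_storage addr;
    socklen_t addrlen = sizeof(addr);   /* value-result: buffer size in, address size out */

    /* MSG_DONTWAIT is not available on Windows; pass 0 there */
    if(recvfrom(sock, buf, BUFLEN, MSG_DONTWAIT, (struct sockaddr *)&addr, &addrlen) > 0) {
        /* handle response */
    } else {
        /* handle error */
    }
}
if(ret < 0) {
    /* handle error */
} else {
    /* select() timed out; we're theoretically done */
}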

This will keep calling recvfrom() until no reply has been received for 2 seconds, which of course means that it will block for a minimum of 2 seconds. Depending on the underlying protocol, you can probably get away with a much shorter timeout; indeed you can decrease it on each response. Some testing and tuning will be required to find the optimum configuration. You don't need to worry about servers responding at the same time; the Ethernet layer will handle that.

Do I need to necessarily "bind" to use select function? Currently I am not binding the socket... — puffadder, Mar 02 '10 at 17:04
@pufadder - No, you do not need to `bind`. `select()` takes a set of file descriptors, and you will receive a file descriptor to add to the set when you create the socket. — jschmier, Mar 02 '10 at 17:10
Even if you do know how many servers were going to respond, you don't know how many messages you are going to get. Some of the packets might be dropped along the way. — nos, Mar 02 '10 at 17:25
You should also add an outer loop, to repeat the entire "send probe and wait for responses" process two or three times, to be robust in the face of lost packets. (You'll also have to be prepared to ignore duplicate responses - but you could theoretically see those even with just one probe anyway). — caf, Mar 02 '10 at 23:52

score 0 · Answer 4 · answered Mar 02 '10 at 19:14

You can't know. It's unknowable.

Presumably, from your description ("I want to find out all the alive devices"), the devices can transition from dead to alive and back again any time they want to. This means that you will have to poll continuously, i.e. send the broadcast request every few seconds (but not too often) and see who responds.

If I've got this right, and since UDP is inherently unreliable, you are going to have to retrofit some reliability on top of UDP:

Send the broadcast out every few seconds, since devices may not receive it every time.
You may not receive their replies, but you might the next time.
A reply confirms that a device is alive.
Wait for 'n' non-responses before declaring a device dead.
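
The bookkeeping described above could be sketched as follows. This is only an illustration: struct device, end_of_round and the value of MAX_MISSES are assumptions, and the receive loop is expected to set replied whenever a reply from that device arrives.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MISSES 3   /* the 'n' above; value chosen only for illustration */

struct device {
    int replied;   /* set to 1 when a reply arrives during the current round */
    int misses;    /* consecutive rounds without a reply */
    int alive;     /* 1 while the device is considered alive */
};

/* Call once after each broadcast round has timed out. */
static void end_of_round(struct device *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].replied) {
            devs[i].misses = 0;      /* a reply confirms the device is alive */
            devs[i].alive = 1;
        } else if (++devs[i].misses >= MAX_MISSES) {
            devs[i].alive = 0;       /* silent for n rounds: declare it dead */
        }
        devs[i].replied = 0;         /* reset for the next round */
    }
}
```

A device that misses one or two rounds stays in the list, which tolerates the occasional lost request or reply.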

Do you know the maximum number of possible devices? If you do, you may find that you have to call recvfrom() that many times.
