
When measuring UDP throughput between a Windows PC and a Zynq-based device with the iperf2 tool, I get around 950 Mb/s over a dedicated 1 Gb Ethernet link. However, when using my own UDP application on the PC I get only around 50 Mb/s, which is drastically lower than what iperf measures. Of course, my UDP application does no processing, only a while loop in which I call the sendto function with UDP packets 1470 bytes in size. The application on the Zynq device is provided by XAPP1026, so it's not mine. I have been looking at the iperf code trying to figure out what they do differently, but basically I can't find any socket or UDP options or anything similar that they use to maximize UDP throughput.

Here is the code of the main function (the MAXUDP define is 1470):

int main(int argc, char** argv)
{
    int sockfd;
    struct sockaddr_in servaddr;
    char sendline[MAXUDP];
    int i;
    int j;
    const int tr_size = ( 200 * MB );
    const int npackets = ( tr_size / MAXUDP );
    const int neval = 2;
    DWORD start;
    DWORD end;
    int optval;

    WSADATA wsaData;
    if (WSAStartup(MAKEWORD(2, 1), &wsaData) != 0)
    {
        printf("Err: %d\n", WSAGetLastError());
        exit(1);
    }

    memset(&servaddr, 0, sizeof(servaddr));  /* bzero() is not available in Winsock */
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    servaddr.sin_addr.s_addr = inet_addr("172.16.0.215");

    /* Socket/Connect/Setsockopt are Stevens-style wrappers that exit on error */
    sockfd = Socket(AF_INET, SOCK_DGRAM, 0);
    Connect(sockfd, (const SA*) &servaddr, sizeof(servaddr));

    optval = 208*KB;
    Setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, (const char*) &optval, sizeof optval);

    prep_data(sendline, MAXUDP);

    for ( i = 1; i <= neval; i++ )
    {
        start = GetTickCount();
        for ( j = 0; j < npackets/neval; j++ )
            sendto(sockfd, sendline, MAXUDP, 0, NULL, 0);
        end = GetTickCount() - start;

        printf("Time elapsed: %lu sec.\n", end/1000);
        printf("Throughput: %lu.%3lu MB/s\n", (tr_size/neval)/end/1000, (tr_size/neval)/end - (tr_size/neval)/end/1000);
    }
    return 0;
}

So, my main question is how to maximize UDP throughput in the same way iperf does it?

UPDATE: I switched to an Ubuntu PC. The results are different, but there is still some random behavior. The first thing I do is set the IP address for eth0 (ifconfig eth0 172.16.0.200 netmask 255.255.255.0) and the gateway address (route add default gw 172.16.0.1). When I run iperf with iperf -c 172.16.0.215 -i 5 -t 25 -u -b 1000m, I get around 800 Mbits/sec. However, after a few runs of iperf in the same way, all of a sudden I start getting only around 15 Mbits/sec or even much less. I figured out that I need to set the IP, netmask, and gateway addresses once again in order to get 800 Mbits/sec back. My UDP application behaves the same way: I measure 957 Mbits/sec (with MAXUDP set to 1470) right after running the commands for setting the IP addresses, but after a few iterations it slows down to around 11 Mbits/sec. Then I set the IP addresses again, and the behavior repeats itself. So, as Kariem stated in his answer, the problem is not in the code itself, but rather in some OS or netif configuration related stuff. However, I must run my UDP application on Windows, so I need to figure out what is happening there. If you guys have any ideas about what could be happening on Windows, please let me know.

Irie
  • Post your code please. It will help. – Kariem Apr 27 '17 at 12:43
  • Are you saying that you send UDP datagrams with a 1470-byte *payload*, or that your packets are *overall* 1470 bytes in size? (Note: this is an example of the kind of question we would not need to pose if you provided a [mcve], as is our usual expectation.) – John Bollinger Apr 27 '17 at 12:51
  • I have added the code, so things should be clearer now. Also, I have just figured out that when I set MAXUDP not to 1470 but to 14700 (well, I am just experimenting), I get up to 550 Mb/s. That only adds to the confusion. – Irie Apr 27 '17 at 13:03
  • @Irie: Check the return value from `sendto()`! That's how you can figure out how many bytes were actually sent, not discarded. `send()` doesn't always send all the bytes you asked to send, so you're probably measuring the wrong thing when you set large buffer sizes. – John Zwinck Apr 27 '17 at 13:17
  • True, @JohnZwinck, but if fewer data were in fact being sent than the OP supposes then he would be getting an overestimate of throughput. That does not explain his observation; instead it raises the possibility that the actual performance may be even worse than he thinks. – John Bollinger Apr 27 '17 at 13:20
  • Though I guess maybe that explains the apparent throughput increase when MAXUDP is increased to 14700. – John Bollinger Apr 27 '17 at 13:22
  • It's UDP, so it either sends all the bytes requested in the datagram, or none? – ThingyWotsit Apr 27 '17 at 13:28
  • I added a part of the code which checks the return value from `sendto()` (it never returns a value smaller than MAXUDP). All in all, the results are the same. – Irie Apr 27 '17 at 13:30
  • Obviously, larger values of `MAXUDP` reduce the overall number of `sendto()` calls. If `sendto()` always reports successfully sending `MAXUDP` bytes for every value of `MAXUDP` so far tested, then it is worthwhile to increase `MAXUDP` further to reduce the number of calls, because syscalls are comparatively expensive. The absolute maximum payload size supported by the UDP protocol is 65507 bytes. – John Bollinger Apr 27 '17 at 13:36
  • It makes sense. I set `MAXUDP` to 65507 and I am getting up to 695 Mb/s. On the other hand, I noticed that iperf also calls `sendto()` with 1470, and it gets even higher results. – Irie Apr 27 '17 at 13:42
  • Unless you are also changing the MTU on your interface, it is unlikely that you are really sending 14000 byte packets. Use wireshark to see where the gaps are in your protocol. – stark Apr 27 '17 at 13:50
  • I am using Wireshark, and it doesn't send 14000-byte packets; it divides (fragments) them into many IP fragments. – Irie Apr 27 '17 at 13:59
  • Yes - UDP datagrams can be fragmented. It's normal for IP. – ThingyWotsit Apr 27 '17 at 20:48

2 Answers


You have a mistake in the way you're calculating throughput: your transfer size is in bytes, so you're reporting bytes per second, while iperf reports bits per second.

change

printf("Throughput: %d.%3d MB/s\n", (tr_size/neval)/end/1000, (tr_size/neval)/end - (tr_size/neval)/end/1000);

to this

printf("Throughput: %d.%3d MB/s\n", ((tr_size/neval)/end/1000)*8, (tr_size/neval)/end - ((tr_size/neval)/end/1000)*8);

I ran a version of your code on my machine and I'm getting 1 Gb/s throughput. Here it is:

#include <netinet/in.h>
#include <sys/socket.h>
#include <string.h>
#include <sys/time.h>
#include <cstdio>
#include <arpa/inet.h>

#define MAXUDP 1470
#define SERV_PORT 5001

static inline long int getCurTimeInMs()
{
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return tp.tv_sec * 1000 + tp.tv_usec / 1000;
}

int main(int argc, char** argv)
{
    int sockfd;
    struct sockaddr_in servaddr;
    char sendline[MAXUDP] = {0};               /* zero-filled payload */
    int i;
    int j;
    const long tr_size = ( 10000L * 1024 * 1024 );  /* long, to avoid int overflow */
    const long npackets = ( tr_size / MAXUDP );
    const int neval = 2;

    int optval;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    servaddr.sin_addr.s_addr = inet_addr("10.0.1.2");

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);
    connect(sockfd, (const sockaddr*) &servaddr, sizeof(servaddr));

    optval = 208*1024;
    setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, (const char*) &optval, sizeof optval);

    long int start = 0, end = 0;
    for ( i = 1; i <= neval; i++ )
    {
        start = getCurTimeInMs();
        for ( j = 0; j < npackets/neval; j++ )
            sendto(sockfd, sendline, MAXUDP, 0, NULL, 0);
        end = getCurTimeInMs() - start;

        printf("Time elapsed: %ld sec.\n", end/1000);
        printf("Throughput: %ld.%3ld MB/s\n", (tr_size/neval)/end/1000 * 8, (tr_size/neval)/end - (tr_size/neval)/end/1000);
    }
    return 0;
}
Kariem
  • I wouldn't say there is a mistake there. For example, when I transfer 100 MB in total, the elapsed time is around 15 seconds and the reported throughput is around 6.6 MB/s, which makes perfect sense. Moreover, looking at the stats in Wireshark, I need around 16 seconds to transfer 71250 UDP packets, each 1514 bytes in size (71250 * 1514 = 107872500 (~100 MB)). 71250 * 1514 / 16 is around 6.7 MB/s. – Irie Apr 27 '17 at 14:54
  • Yes, the numbers are right if you're calculating your throughput in bytes per second, but that's not what iperf does. iperf computes throughput in bits per second, so it is going to be 8 times your calculated figure. Mbps stands for megabits per second. – Kariem Apr 27 '17 at 14:57
  • Yes, so that means in this example I am getting 6.6 MB/s * 8, which is about 55 Mb/s. That's what I said in the original post. This result is far from the 950 Mb/s iperf measures. – Irie Apr 27 '17 at 15:04
  • I went by your code, so if you're manually multiplying by 8 at the end to get throughput in bits per second, then fair enough. As I stated in the answer, I tested your code and I got 1 Gbps, which is the capacity of my Ethernet link, so clearly your code is not the problem. The only other explanation would be that your code is not running frequently enough for some reason. Try running it with higher priority and try setting your socket to non-blocking. – Kariem Apr 27 '17 at 15:12
  • Yes, I was doing it manually. Thanks for trying the code out! I guess you were measuring throughput between two PCs, or was the other side running on some other device? I will try setting the socket to non-blocking, but I don't know how to make the app run with higher priority. If you have any references about this, I would be grateful if you could share them. – Irie Apr 27 '17 at 15:19
  • Yes, I was testing between a desktop running Ubuntu and a tablet running Android. You can change the priority on Linux using renice -n -20 -p – Kariem Apr 27 '17 at 17:48

Why did you define MAXUDP as 1470? Try setting it to 65535, measure again, and report back here.

Do not confuse the Ethernet frame size (1500 bytes) with the UDP datagram size; they are different things. Let the IP stack do the necessary fragmentation instead of your application. That may be more efficient.

Sokre
  • I did that as well. On Windows I am getting up to 695 Mbits/sec in that case. However, iperf on Windows sends 1470-byte datagrams and gets more than 900 Mbits/sec. I don't know how to explain that. – Irie Apr 28 '17 at 11:56
  • When running my app on Ubuntu I am getting 955 Mbits/sec even with `MAXUDP` set to 1470. – Irie Apr 28 '17 at 12:02
  • One more thing: the part of your code where you measure elapsed time (the for loop with sendto() in it) measures only how quickly your app hands the data over to the socket/IP stack. There is no feedback from the "other side" when using UDP. You could just as well disconnect the cable. – Sokre Apr 28 '17 at 12:27
  • Also, I see that you are casting optval to (const char *) in the setsockopt call while declaring it as int. [This page](https://msdn.microsoft.com/en-us/library/windows/desktop/ms740532(v=vs.85).aspx) says the expected data type is DWORD. – Sokre Apr 28 '17 at 12:36
  • Yes, you are right: even when I disconnect the cable, I get the same results. However, on the server side (the Zynq device) I am measuring how much data is received, so that's my feedback from the "other side". That setsockopt line doesn't change anything; nevertheless, I also put getsockopt after it and I am getting the expected value. – Irie Apr 28 '17 at 12:54