In an HFT trading application I need to receive data from a UDP multicast socket. The only requirement is latency; it is so important that I can "spend" one CPU core on it. It's OK to spin or whatever. This is what I currently have on Windows:

void Receiver::ThreadMethod() {
    //UINT32 seq;
    sockaddr_in Sender;

    while (stayConnected) {
        // recvfrom() overwrites the address length, so reset it before every call.
        int SenderAddrSize = sizeof(Sender);
        int res = recvfrom(socketId, buf, sizeof(char) * RECEIVE_BUFFER_SIZE, 0,
                           (SOCKADDR *)&Sender, &SenderAddrSize);
        if (res == SOCKET_ERROR) {
            printf("recvfrom failed, WSAGetLastError: %d\n", WSAGetLastError());
            continue;
        }
        //seq = *(UINT32*)buf;
        //printf("%12s:seq=%6d:len=%4d\n", inet_ntoa(Sender.sin_addr), seq, res);
        // Hand the raw datagram (res bytes) to the feed handler.
        unsigned char* buf2 = reinterpret_cast<unsigned char*>(buf);
        feed->ProcessMessage(res, buf2);
    }
}

recvfrom blocks, so it will likely be very slow (or am I wrong?). I need to rewrite this for Linux and achieve the best possible latency. I need to process just ONE socket per thread, so I assume I should NOT use epoll, as it is designed more for processing many sockets. What should I use?

Update: I've found a similar question: Low-latency read of UDP port

Oleg Vazhnev
  • It's not clear why you think a blocking recvfrom() call will be very slow. It's true it won't return for a long time if no packets are received, but if/when a packet is received it should return right away. Is it context-switching overhead that you're worried about? – Jeremy Friesner Sep 18 '14 at 19:50
  • Btw if you want guaranteed low latency, you might look at Xenomai real time extensions for Linux, as that is what they provide. – Jeremy Friesner Sep 18 '14 at 19:51
  • @JeremyFriesner blocking is always expensive, that's why people "spin" – Oleg Vazhnev Sep 18 '14 at 20:00
  • "blocking is always expensive" <-- can you explain why, or is that an article of faith? – Jeremy Friesner Sep 18 '14 at 21:33
  • @JeremyFriesner I think it's because when you block, you spend time on the "wake-up"; when you don't block, you save this time. – Oleg Vazhnev Sep 19 '14 at 05:59

1 Answer

On UNIX, use fcntl to put the socket into non-blocking mode:

int flags = fcntl(socket, F_GETFL, 0);        // keep the flags that are already set
fcntl(socket, F_SETFL, flags | O_NONBLOCK);
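To illustrate how a non-blocking socket fits the "spin on one core" approach from the question, here is a minimal sketch for Linux. The helper names MakeNonBlocking and SpinReceive are hypothetical, and stayConnected simply mirrors the flag in the question's code. It relies on the fact that recv on a non-blocking socket returns -1 with errno set to EAGAIN or EWOULDBLOCK when no datagram is queued. Whether this actually beats a blocking recvfrom on your hardware is something to measure, not assume.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: add O_NONBLOCK while keeping the flags already set.
inline bool MakeNonBlocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical helper: busy-spin on a non-blocking socket until one datagram
// arrives. Returns its length, or -1 on a real error or when asked to stop.
inline ssize_t SpinReceive(int fd, char* buf, std::size_t len,
                           const std::atomic<bool>& stayConnected) {
    assert(buf != nullptr && len > 0);
    while (stayConnected.load(std::memory_order_relaxed)) {
        ssize_t n = recv(fd, buf, len, 0);
        if (n >= 0) return n;  // got a datagram (possibly zero-length)
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;         // genuine socket error
        // No data queued yet: loop again without sleeping.
    }
    return -1;                 // stop was requested
}
```

The receiving thread would call MakeNonBlocking once after creating the socket and then call SpinReceive in its loop, passing each datagram to the feed handler. The loop burns a full core by design, which is the trade-off the question explicitly accepts.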

Additionally, if your client needs to process multiple sockets (e.g. to aggregate multiple feeds), you should use a select call to check multiple file descriptors at once and see which sockets, if any, have data available. Among other things, this avoids looping through all the sockets for nothing.
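As a rough sketch of the select approach for several feeds, the function below polls a set of sockets once with a zero timeout and returns the ones that have data waiting. The name ReadableSockets is hypothetical and only for illustration. Note that select can only track descriptors below FD_SETSIZE, which the sketch checks with an assert.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <vector>

// Hypothetical helper: poll the given sockets once, without blocking, and
// return the descriptors that currently have data to read.
inline std::vector<int> ReadableSockets(const std::vector<int>& fds) {
    std::vector<int> ready;
    if (fds.empty()) return ready;

    fd_set readSet;
    FD_ZERO(&readSet);
    int maxFd = -1;
    for (int fd : fds) {
        assert(fd >= 0 && fd < FD_SETSIZE);  // precondition of FD_SET
        FD_SET(fd, &readSet);
        if (fd > maxFd) maxFd = fd;
    }

    timeval timeout{0, 0};  // zero timeout: return immediately
    int n = select(maxFd + 1, &readSet, nullptr, nullptr, &timeout);
    if (n <= 0) return ready;  // nothing readable, or an error

    for (int fd : fds)
        if (FD_ISSET(fd, &readSet)) ready.push_back(fd);
    return ready;
}
```

A handler thread could call this in a tight loop and then call recv on each returned descriptor. With a single socket per thread, as in the question, the extra select call adds a system call per packet and is probably not worth it.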

As for latency, other factors such as the NIC type and configuration and the kernel settings (possibly using a NIC that bypasses the kernel) will have a considerable impact, which should be measured.

quantdev
  • As far as I know, in low-latency code it's better to use "one thread - one socket". Also note that I receive the SAME data on both sockets. Using TWO threads, it will be possible to receive TWO packets in parallel, which should improve latency a little bit. – Oleg Vazhnev Sep 18 '14 at 20:02
  • The NIC and everything else is brilliantly configured and the fastest on the planet; only the software is not written yet. – Oleg Vazhnev Sep 18 '14 at 20:03
  • @javapowered no, it's not "one thread, one socket"; most handlers spin on select on one or two CPUs, but it all depends on your available hardware (multiple threads would get concurrent access to the NIC anyway, plus the context-switch costs of running them all). Hope this helps. – quantdev Sep 18 '14 at 20:14