On my office server, the output of cat /proc/interrupts displays one tx queue and 5 rx queues. Is this due to NIC hardware capabilities, or can you configure multiple receive ring buffers per NIC in Linux? Also, wouldn't 5 rx queues increase latency, since TCP has to poll 5 queues to get an answer, even though the 5 queues are being filled by 5 different CPUs because each rx queue has a different CPU affinity?

I would like to point out that the NIC supports 10GE, but the network only supports 1Gb bandwidth.
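For reference, a rough sketch of how the queue layout can be inspected, and changed where the driver allows it (the interface name eth0 and the channel count are just examples; the exact options depend on the driver):

```
# How many interrupt lines (queues) the NIC exposes, and which CPUs service them
grep eth0 /proc/interrupts

# Queue (channel) counts the hardware supports vs. what is currently configured
ethtool -l eth0

# Per-queue ring buffer sizes (separate from the number of queues)
ethtool -g eth0

# Change the number of queues, if the driver supports it; some drivers only
# expose "combined" channels rather than separate rx/tx channels
ethtool -L eth0 combined 4
```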

1 Answer
The use of multiple queues provides a performance improvement by allowing multiple cores to poll the network card efficiently at the same time.
> Also, wouldn't 5 rx queues increase latency, since TCP has to poll 5 queues to get an answer, even though the 5 queues are being filled by 5 different CPUs because each rx queue has a different CPU affinity?
Latency will be much better because each packet will, on average, only have 1/5th as many packets ahead of it in the queue.
If you only had a single queue, you could only effectively pull packets from it with a single core. If that core went on to process the packet, that would cause horrible bottlenecks because nothing would be pulling the next packet from the NIC. If you hand the packet off, you again have horrible bottlenecks because handing data from one core to another causes lots of cache misses. The most efficient setup is to have packets stream in from multiple queues to multiple cores which then dispatch those packets in parallel.
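As a hedged illustration of that setup on Linux: each rx queue normally appears as its own interrupt line, and each line can be pinned to a different core so that core is the one polling that queue. The interface name, IRQ numbers, and CPU masks below are made-up values; read the real ones from /proc/interrupts first, and note that irqbalance may rewrite these masks unless it is configured to leave them alone.

```
# Queue interrupts typically show up as eth0-rx-0 ... eth0-rx-4 or similar
grep eth0 /proc/interrupts

# Pin each queue's interrupt to its own core (hypothetical IRQ numbers 45-49)
echo 01 > /proc/irq/45/smp_affinity   # rx queue 0 -> CPU0
echo 02 > /proc/irq/46/smp_affinity   # rx queue 1 -> CPU1
echo 04 > /proc/irq/47/smp_affinity   # rx queue 2 -> CPU2
echo 08 > /proc/irq/48/smp_affinity   # rx queue 3 -> CPU3
echo 10 > /proc/irq/49/smp_affinity   # rx queue 4 -> CPU4
```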

- I am assuming the 5 queues are a NIC-specific feature, since the driver would have to be configured to publish to those 5 queues? Would such a driver be multi-threaded, with one thread per queue? Also, how does TCP learn to poll one queue vs 5 queues? – Jimm Jan 07 '13 at 06:37
- The queues are a NIC feature. The driver will run on multiple cores. TCP doesn't poll queues; the NIC driver polls the queues and passes the received packets to the TCP layer. The TCP layer in modern operating systems is able to run concurrently on multiple cores. The idea is to eliminate the single-source bottleneck. If you only have one source, you have to "hand things off" to other cores, the source becomes the bottleneck, and the handoff breaks cache affinity. – David Schwartz Jan 07 '13 at 06:41
- Please correct me if I am wrong, but my understanding is that TCP can poll queues. Consider a situation where the socket's receive buffer is full. At that time, the driver will only publish data to the rx queues. When the user empties the receive buffer and wants to read more data, the read system call can go all the way to the rx buffers to receive it. – Jimm Jan 07 '13 at 06:53
- No, not at all. That would be madness. Space in the NIC receive buffers is precious -- wasting it like that would be nuts. The packets flow upwards. The NIC driver polls the NIC and hands packets off to higher layers as appropriate. – David Schwartz Jan 07 '13 at 06:58
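On the point in the comments that the queues are a NIC feature: the card typically chooses the rx queue by hashing the packet's address and port fields (receive-side scaling), so all packets of a given TCP connection land in the same queue and stay on the same core. A sketch of how to inspect that, assuming an interface named eth0 and a driver that implements these ethtool operations:

```
# Show the RSS hash key and indirection table (hash result -> rx queue)
ethtool -x eth0

# Show which header fields are hashed for TCP over IPv4
ethtool -n eth0 rx-flow-hash tcp4
```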