
In order to make it possible for the responder to distinguish duplicate packets from out of order packets, a given send queue shall have a series of PSNs no greater than 8,388,608 outstanding at any given time. Therefore, a send queue shall have no more than 8,388,608 packets outstanding at any given time. This includes the sum of all SEND request packets plus all RDMA WRITE request packets plus all ATOMIC Operation request packets plus all expected RDMA READ response packets. Thus, the PSN space (consisting of a range of 16,777,216 PSNs) is divided into two regions, each occupying a range of 8,388,608 PSNs, called the valid region and the invalid region.

The passage above is quoted from the IBTA spec. Why would it be impossible to distinguish duplicate packets from out-of-order packets if the valid region were larger than half of the 2^24-entry PSN space?


Barney_su

1 Answer


Imagine that the total PSN range was smaller, to simplify the example, say 0..3. The valid region would be 2 packets if we follow the spirit of the spec, which would include the expected PSN and 1 previous duplicate PSN, but let's say we increase it to 3 packets.

Take a look at the two scenarios below:

Out of order scenario

Sender sends  | Receiver sends
Send 0        | Ack 0
Send 1 (lost) |
Send 2 (lost) |
Send 3        | ?

After the receiver receives Send 0, the expected PSN is 1. When the receiver gets the fourth packet (Send 3), it is an out-of-order packet, ahead of the expected PSN by 2. The responder should treat this as a sequence error.

Duplicate scenario

Sender sends     | Receiver receives | Receiver sends
Send 3           | Send 3            | Ack 3 (lost)
Send 3 (delayed) |                   |
Send 0           | Send 0            | Ack 0
                 | Send 3 (delayed)  | ?

Here the sender retransmits Send 3 after it times out waiting for the lost ack. The retransmission is delayed in the network, and the receiver sees it only after it has received Send 0. The expected PSN on the receiver is 1, and the delayed packet falls within the valid region (2 packets behind, since PSN 3 wraps around to 0), so the receiver should treat it as a duplicate packet.

Summary

As you can see, in both scenarios the receiver state (expected PSN) is the same, and the received packet has the same PSN, so with a valid region of 3, it wouldn't be able to distinguish between the two scenarios. If we limit the valid region to 2, the first scenario wouldn't be possible, as the sender would have to wait for an acknowledgement for PSN 1 before sending PSN 3.
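The ambiguity can also be checked mechanically. The following is a minimal sketch of the toy 4-PSN model above, not the spec's exact region definitions. The function `possible_meanings` and its parameters `max_ahead` and `max_behind` are hypothetical names introduced here: `max_ahead` is assumed to be how far past the expected PSN the sender may have run, and `max_behind` how far back a PSN is still accepted as a duplicate.

```python
def possible_meanings(psn, epsn, space, max_ahead, max_behind):
    """Return every interpretation consistent with a received PSN.

    All arithmetic is modulo `space`, so distances wrap around.
    max_ahead and max_behind are assumptions of this toy model.
    """
    ahead = (psn - epsn) % space   # distance if the packet is newer than expected
    behind = (epsn - psn) % space  # distance if the packet is older than expected
    meanings = set()
    if ahead == 0:
        meanings.add("expected")
    if 1 <= ahead <= max_ahead:
        meanings.add("out-of-order")
    if 1 <= behind <= max_behind:
        meanings.add("duplicate")
    return meanings


SPACE = 4  # toy PSN range 0..3
EPSN = 1   # receiver state in both scenarios

# Up to 3 packets outstanding: the sender may be 2 ahead of the expected PSN.
wide = possible_meanings(3, EPSN, SPACE, max_ahead=2, max_behind=2)

# Up to 2 packets outstanding: the sender may be only 1 ahead.
narrow = possible_meanings(3, EPSN, SPACE, max_ahead=1, max_behind=2)

print(sorted(wide))    # ['duplicate', 'out-of-order']
print(sorted(narrow))  # ['duplicate']
```

In this model a PSN has a single meaning as long as `max_ahead + max_behind + 1` does not exceed the size of the PSN space, which is why splitting the space into two halves leaves no overlap.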

haggai_e
  • This explanation makes sense. So, as I understand it, the sender enforces this scheme by not sending new requests while there are already, say, 2 outstanding requests (in the example above), and the receiver treats a packet as a duplicate if the received PSN is in the range (expected PSN - 2, expected PSN]. If that is the case, what if a packet is stuck somewhere for a very long time and finally arrives as a stale ghost packet? Is it possible to recognize such packets using only the PSN? – Barney_su Nov 03 '20 at 06:14
  • If a packet can be delayed for an arbitrarily long time, I think you could devise a scenario where such a packet is mistakenly considered an out-of-order packet instead of a duplicate, or even the expected packet. I don't think the IB transport has a solution for that. – haggai_e Nov 03 '20 at 06:59
  • yea I'd agree with you on that. Thanks for clarifying the question. – Barney_su Nov 03 '20 at 08:33
  • By the way, I think the IPSec anti-replay protocol solves this issue by forbidding wraparound of PSNs. Basically, you need to establish a new security association every 2^32 packets. – haggai_e Nov 03 '20 at 10:34
  • Good examples showing the ambiguity when the valid region of the PSN space is not limited, but I have two more questions. 1) Why is the valid region exactly half of 2^24? 2) TCP sequence numbers also wrap around, so why doesn't TCP need this mechanism? – Bloodmoon Jun 12 '23 at 14:57