I am working with DPDK version 18.11.8 stable on Linux with an Intel X722 NIC.

My app works fine if I calculate IP and UDP checksums in software but I get a segmentation fault if I calculate in hardware. Here is my code:

local_port_conf.txmode.offloads  = local_port_conf.txmode.offloads | DEV_TX_OFFLOAD_IPV4_CKSUM  | DEV_TX_OFFLOAD_UDP_CKSUM;
mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_UDP_CKSUM; 
mb->l2_len = sizeof(struct ether_hdr);
mb->l3_len = sizeof(struct ipv4_hdr);
mb->l4_len = sizeof(struct udp_hdr);        
p_ip_hdr->hdr_checksum = 0;
p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum((const ipv4_hdr*)(mb->l3_len), mb->ol_flags);

The rte_ipv4_phdr_cksum() call is mysterious to me; have I understood correctly what to do?

Understandably, the C++ compiler gives a warning:

warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
         p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum((const ipv4_hdr*)(ptMbuf->l3_len), ptMbuf->ol_flags);
                                                                                      ^

What is wrong with my code?

DavidA
  • Hi @DavidA if you are requesting HW for udp checksum offload, then you should be `dgram_cksum = 0` and not calculate raw checksum with `rte_ipv4_phdr_cksum`. Please note, as suggested in other ticket, one should check if feature is available by `(dev_info.tx_offload_capa & DEV_TX_OFFLOAD_UDP_CKSUM)`. Can you please edit your code and update the ticket with the result. – Vipin Varghese Jul 06 '20 at 12:23
  • @VipinVarghese, thanks but https://doc.dpdk.org/guides/prog_guide/mbuf_lib.html says: "set out_udp checksum to pseudo header using rte_ipv4_phdr_cksum()" – DavidA Jul 06 '20 at 12:40
  • did you try setting udp_cksum to 0? I remember CKSUM offload for I40E is partial, hence requested for checking by (dev_info.tx_offload_capa & DEV_TX_OFFLOAD_UDP_CKSUM. waiting for your updates. `http://doc.dpdk.org/guides/nics/overview.html` Partial. Also I am not sure if we really should use `rte_ipv4_phdr_cksum` for UDP checksum? – Vipin Varghese Jul 06 '20 at 13:03
  • @VipinVarghese I have checked the capability as you suggested, the device does support DEV_TX_OFFLOAD_IPV4_CKSUM and DEV_TX_OFFLOAD_UDP_CKSUM. Setting udp_cksum to 0 fixes the seg fault, but gives an incorrect udp checksum. – DavidA Jul 06 '20 at 13:52
  • So the device supports UDP checksum, which is good news. By setting udp_cksum to 0 we made sure the device is calculating the checksum, which is also good. Now pass the pseudo checksum value as `rte_ipv4_phdr_cksum([offset of start of L4], mb->ol_flags);` where the L4 offset is `mtod(mbuf) + m->l2_len`. In your current sample code you have passed `(mb->l3_len)`, which I believe is not correct – Vipin Varghese Jul 06 '20 at 14:29
  • @VipinVarghese Thanks so I now have: struct ipv4_hdr* p_ipv4_hdr = rte_pktmbuf_mtod(mb, struct ipv4_hdr *) + mb->l2_len; p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ipv4_hdr, mb->ol_flags); but the udp checksum is still incorrect. – DavidA Jul 06 '20 at 15:00
  • Hi David, I believe you have not set the right values. Will update in a couple of hours – Vipin Varghese Jul 06 '20 at 16:04
  • @VipinVarghese Thank you, I am grateful for your help. – DavidA Jul 06 '20 at 16:07
  • Hi David we can take this in dpdk-debug. meanwhile please try `get pointer to ethernet, ipv4 and udp, update the IP and UDP fields. for /* SW check sum */ p_ip_hdr->hdr_checksum = 0; p_udp_hdr->dgram_cksum = 0; p_udp_hdr->dgram_cksum = rte_ipv4_udptcp_cksum(p_ip_hdr, p_udp_hdr); /* HW check sum */ p_ip_hdr->hdr_checksum = 0; p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ipv4_hdr, ol_flags);` – Vipin Varghese Jul 06 '20 at 17:35
  • @VipinVarghese HW udp checksum calculation is working correctly with p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ipv4_hdr, ap_mbuf->ol_flags); Thank you. – DavidA Jul 07 '20 at 13:34

2 Answers

Following are the steps required for HW checksum offload for IP and UDP in DPDK:

  1. Check whether hardware supports HW checksum offload.
  2. Prepare the packet IP and UDP header fields.
  3. Initialize the checksum fields. In case of partial UDP checksum offload support, initialize the UDP checksum with the checksum of the IP pseudo-header.

[EDIT based on the query in the comment from @maxschlepzig] Are there NICs which support full UDP checksum offloading? If yes, is there some DPDK way to check for full support?

[Answer] Yes, there is an easy way to check: Table 1.1 of the NIC features overview in the DPDK documentation shows whether checksum offload is available, partial, or absent for each driver. The real catch is that you also have to check the release notes of the NIC-specific firmware, because HW ASIC or fixed functions can be enabled or disabled by it.

Note: The Intel X553, X710 and X722 NICs (and probably others) only support partial UDP checksum offload, which requires a snippet like this:

/* during port configuration (DPDK 18.11 names; from 19.08 on the structs are rte_-prefixed) */
struct rte_eth_conf port_conf = {
    .txmode = {
        .offloads = DEV_TX_OFFLOAD_IPV4_CKSUM | DEV_TX_OFFLOAD_UDP_CKSUM,
    },
};

/* during device configuration: dev_info was filled by rte_eth_dev_info_get() */
if (!(dev_info.tx_offload_capa & DEV_TX_OFFLOAD_UDP_CKSUM) ||
    !(dev_info.tx_offload_capa & DEV_TX_OFFLOAD_IPV4_CKSUM)) {
    rte_panic("offload not supported\n");
}

/* mb is the mbuf packet to transmit */
mb->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_UDP_CKSUM;
mb->l2_len = sizeof(struct ether_hdr);
mb->l3_len = sizeof(struct ipv4_hdr);

/* locate the headers with byte offsets, not typed pointer arithmetic */
struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
struct ipv4_hdr *p_ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + sizeof(struct ether_hdr));
struct udp_hdr *p_udp_hdr = (struct udp_hdr *)((char *)p_ipv4_hdr + sizeof(struct ipv4_hdr));

/* update v4 header fields with version, length, ttl, ip and others */
/* update udp headers */

/* in case hardware offload is unavailable: compute both checksums in
 * software (and do not set the PKT_TX_*_CKSUM flags above) */
p_ipv4_hdr->hdr_checksum = 0;
p_udp_hdr->dgram_cksum = 0;
p_udp_hdr->dgram_cksum = rte_ipv4_udptcp_cksum(p_ipv4_hdr, p_udp_hdr);
p_ipv4_hdr->hdr_checksum = rte_ipv4_cksum(p_ipv4_hdr);

/* otherwise, equivalent offloaded checksum computation: seed the UDP
 * checksum with the pseudo-header checksum and let the NIC finish it */
p_ipv4_hdr->hdr_checksum = 0;
p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ipv4_hdr, mb->ol_flags);

[EDIT-2] Thanks to @maxschlepzig for pointing out that l4_len is not required for UDP checksum offload. That is correct: l4_len is only required for TCP segmentation offload (TSO) on TX.

Based on a live debug session with @DavidA, the segfault was traced to incorrect usage of rte_ipv4_phdr_cksum(). The relevant exchange from the comments was:

@DavidA: Thanks, so I now have: struct ipv4_hdr* p_ipv4_hdr = rte_pktmbuf_mtod(mb, struct ipv4_hdr *) + mb->l2_len; p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ipv4_hdr, mb->ol_flags); but the UDP checksum is still incorrect.

@VipinVarghese: Hi David, I believe you have not set the right values. Will update in a couple of hours.

One of the action items shared with David was to update the pseudo code, since the real question was why HW-offloaded calculation of the UDP checksum was not working in DPDK. Thanks to @maxschlepzig for explaining it all, as comments and live debug sessions are easily missed by many.
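One plausible reason the intermediate version quoted above still produced a wrong checksum is C pointer arithmetic: rte_pktmbuf_mtod(mb, struct ipv4_hdr *) + mb->l2_len advances by l2_len elements of 20 bytes each, not by l2_len bytes. The plain-C sketch below shows the difference without needing DPDK; struct fake_ipv4_hdr and the two helper functions are hypothetical stand-ins invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 20-byte stand-in for DPDK's struct ipv4_hdr (no options). */
struct fake_ipv4_hdr {
    uint8_t bytes[20];
};

/* Typed arithmetic: "+ l2_len" moves l2_len * sizeof(struct) bytes. */
static size_t typed_offset(const uint8_t *frame, size_t l2_len)
{
    const struct fake_ipv4_hdr *p =
        (const struct fake_ipv4_hdr *)frame + l2_len;

    return (size_t)((const uint8_t *)p - frame);
}

/* Byte arithmetic: add the offset first, then cast to the header type. */
static size_t byte_offset(const uint8_t *frame, size_t l2_len)
{
    const struct fake_ipv4_hdr *p =
        (const struct fake_ipv4_hdr *)(frame + l2_len);

    return (size_t)((const uint8_t *)p - frame);
}
```

With a 14-byte Ethernet header, the typed version lands 280 bytes into the frame instead of 14. In DPDK, rte_pktmbuf_mtod_offset(mb, struct ipv4_hdr *, mb->l2_len) performs the byte-wise variant.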

Vipin Varghese
  • hi David thanks for confirming, I have updated the answer please accept and upvote – Vipin Varghese Jul 08 '20 at 01:53
  • Hi Vipin, thanks for your answer. I think in the case of SW checksum, it should be: p_ip_hdr->hdr_checksum = rte_ipv4_cksum( p_ip_hdr ); Do you agree? – DavidA Jul 08 '20 at 08:27
  • Since upgrading from DPDK 18.11.9 to 19.11.8, UDP checksum offloading has stopped working in our application. Are you aware of any changes that may have caused this? – DavidA Apr 26 '21 at 16:53
  • It seems that the `l4_len` doesn't have to be set. At least with the Intel X553 the hardware offloading of IP/(partial) UDP checksum seems to work without it, at least wireshark validates the resulting checksums. – maxschlepzig Sep 09 '21 at 21:09
  • Are there NICs which support full UDP checksum offloading? If yes, is there some DPDK way to check for full support? I mean, since the `DEV_TX_OFFLOAD_UDP_CKSUM` capability is already set for NICs that 'only' support partial UDP checksum offloading. – maxschlepzig Sep 09 '21 at 21:14
  • Hi @VipinVarghese, if we have VLAN with IPv4 header what are the flags we need to pass for ol_flags? I tried with `PKT_TX_VLAN` in m->ol_flags and `DEV_TX_OFFLOAD_VLAN_INSERT` in port configuration but it is not working for me – Adarsha Verma Dec 24 '21 at 07:07
  • @AdarshaVerma thanks for the question, I have to mention I am not clear with the same. Can you explain it more clearly? – Vipin Varghese Dec 28 '21 at 09:44
  • I want to add a comment about the i40e driver: it only partially supports checksum offload. For example, if a packet is fragmented, the UDP checksum is wrong, so fragmentation needs to be done in SW, not in HW. For incoming packets, the IP header checksum needs to be recalculated after reassembly. – Yasin Caner Mar 11 '22 at 10:40

The function rte_ipv4_phdr_cksum() is documented as:

/**
 * Process the pseudo-header checksum of an IPv4 header.
 *
 * The checksum field must be set to 0 by the caller.
 *
 * Depending on the ol_flags, the pseudo-header checksum expected by the
 * drivers is not the same. For instance, when TSO is enabled, the IP
 * payload length must not be included in the packet.
 *
 * When ol_flags is 0, it computes the standard pseudo-header checksum.
 *
 * @param ipv4_hdr
 *   The pointer to the contiguous IPv4 header.
 * @param ol_flags
 *   The ol_flags of the associated mbuf.
 * @return
 *   The non-complemented checksum to set in the L4 header.
 */
static inline uint16_t
rte_ipv4_phdr_cksum(const struct rte_ipv4_hdr *ipv4_hdr, uint64_t ol_flags)

So when you cast like this

p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum((const ipv4_hdr*)(mb->l3_len),
        mb->ol_flags);

the ipv4_hdr parameter contains a nonsensical address: the integer value of l3_len (20) reinterpreted as a pointer. Thus, this can't work, and it is the root cause of your segmentation fault (an invalid address is dereferenced).

For computing the checksum of the pseudo-header you have to supply your original IP header, thus, just call it like this:

p_ip_hdr->hdr_checksum = 0;
p_udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(p_ip_hdr, mb->ol_flags);

The rte_ipv4_phdr_cksum() then internally builds a temporary pseudo header (based on the supplied p_ip_hdr) in order to compute its checksum.
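To make the "partial" part concrete, here is a minimal plain-C sketch (no DPDK) of the arithmetic involved. It assumes the NIC simply sums the UDP segment, including the seeded checksum field, and complements the result; that hardware model and all function names here (ones_sum, phdr_sum, udp_cksum_sw, udp_cksum_hw_emulated) are illustrative assumptions, not DPDK or driver code. Values are handled as explicit big-endian bytes, so the sketch does not depend on host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One's-complement sum over big-endian 16-bit words, folded to 16 bits.
 * `sum` is an initial value, so partial sums can be chained. */
static uint16_t ones_sum(const uint8_t *data, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                      /* odd trailing byte is zero-padded */
        sum += (uint32_t)data[len - 1] << 8;
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Non-complemented sum of the IPv4 pseudo-header:
 * source address, destination address, zero byte, protocol, L4 length. */
static uint16_t phdr_sum(const uint8_t src[4], const uint8_t dst[4],
                         uint8_t proto, uint16_t l4_len)
{
    uint8_t psd[12];

    memcpy(psd, src, 4);
    memcpy(psd + 4, dst, 4);
    psd[8] = 0;
    psd[9] = proto;
    psd[10] = (uint8_t)(l4_len >> 8);
    psd[11] = (uint8_t)(l4_len & 0xff);
    return ones_sum(psd, sizeof(psd), 0);
}

/* Full software UDP checksum; the checksum field (bytes 6..7) must be 0. */
static uint16_t udp_cksum_sw(const uint8_t src[4], const uint8_t dst[4],
                             const uint8_t *udp, uint16_t len)
{
    uint16_t c = (uint16_t)~ones_sum(udp, len, phdr_sum(src, dst, 17, len));

    return c ? c : 0xffff;            /* 0 would mean "no checksum" in UDP */
}

/* Assumed model of partial offload: the caller has seeded bytes 6..7 with
 * the pseudo-header sum, and the "NIC" only sums the UDP segment itself. */
static uint16_t udp_cksum_hw_emulated(const uint8_t *udp, uint16_t len)
{
    uint16_t c = (uint16_t)~ones_sum(udp, len, 0);

    return c ? c : 0xffff;
}
```

Because one's-complement addition is associative, seeding the checksum field with the pseudo-header sum and then summing only the UDP segment gives the same result as the full software computation. This is also why leaving the field at 0 on a partial-offload NIC yields a wrong checksum: the pseudo-header contribution is simply missing.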

NB: You don't have to assign to mb->l4_len for UDP checksum offloading. Setting l2_len and l3_len is sufficient.

maxschlepzig
  • Best answer so far. So concise and to the point. I like how you stressed out clearly the reason behind the segfault. I also noticed that you wondered whether one could tell full checksum offload support from SW-assisted (partial) one. At the moment, no way to do that. Yes, `DEV_TX_OFFLOAD_UDP_CKSUM` can mean both variants. But the truth is that, theoretically, applications should not prepare pseudo-header checksum and similar stuff themselves. It's the duty of `rte_eth_tx_prepare()`. DPDK API contract insists that the user first invoke `rte_eth_tx_prepare()`, then `rte_eth_tx_burst()`. – user10304260 Sep 11 '21 at 20:33
  • PMDs which need partial checksum preparations to be done, already have the corresponding code in their implementations of `tx_pkt_prepare` callback. This applies both to regular Tx operation and TSO. – user10304260 Sep 11 '21 at 20:33
  • @stackinside Good point regarding `rte_eth_tx_prepare()`! I have to confess that I missed the reference to it at the end of the API doc of `rte_eth_tx_burst()` function. Perhaps I also looked at too many examples that invoke `rte_eth_tx_burst()` without it. FWIW, none of the bundled DPDK examples call `rte_eth_tx_prepare()`. There is one example (the vhost sample) which directly invokes `rte_ipv4_phdr_cksum()` - perhaps due to only targeting a specific PMD. I've tested `rte_eth_tx_prepare()` and it works as expected. – maxschlepzig Sep 12 '21 at 11:29
  • Nice research of yours. If I recall correctly, this API is used by `app/test-pmd/csumonly.c`. – user10304260 Sep 12 '21 at 12:14