I'm looking into the capabilities of fragment/packet reassembly hardware and wondering whether it could be used to perform application-level packet reassembly.
I need to receive a sequence of 65 KiB packets (hundreds of streams adding up to 200 or 400 Gbit/s), reorder them, and assemble them into larger units (e.g. 512 KiB frames) in my application before passing them on to further computation.
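For concreteness, the per-frame work that would otherwise be done on the CPU looks roughly like the sketch below. Everything in it is hypothetical and only for illustration: the `frag_hdr` layout, the 64 KiB payload size, and the names `frame_slot` and `frame_add_fragment` are assumptions, not an actual protocol definition. Each fragment carries its index within the frame, so out-of-order arrivals are copied straight to their final offset and a bitmask tracks completion.

```c
#include <stdint.h>
#include <string.h>

/* Assumed sizes: 64 KiB fragment payloads, 512 KiB frames (8 fragments). */
#define FRAG_SIZE        (64u * 1024u)
#define FRAME_SIZE       (512u * 1024u)
#define FRAGS_PER_FRAME  (FRAME_SIZE / FRAG_SIZE)

/* Hypothetical application-level header that follows the UDP header. */
struct frag_hdr {
    uint32_t stream_id;  /* which of the hundreds of streams */
    uint32_t frame_id;   /* which frame this fragment belongs to */
    uint16_t frag_idx;   /* position of the fragment within the frame */
    uint32_t frag_len;   /* payload bytes in this fragment */
};

/* One frame being reassembled. */
struct frame_slot {
    uint32_t frame_id;
    uint32_t received_mask;      /* bit i is set once fragment i arrived */
    uint8_t  data[FRAME_SIZE];
};

/*
 * Copy one fragment to its final offset in the frame.
 * Returns 1 when the frame is complete, 0 when more fragments are
 * needed, and -1 when the fragment is rejected (wrong frame, bad
 * index or length, or duplicate).
 */
static int frame_add_fragment(struct frame_slot *slot,
                              const struct frag_hdr *hdr,
                              const uint8_t *payload)
{
    if (hdr->frame_id != slot->frame_id)
        return -1;
    if (hdr->frag_idx >= FRAGS_PER_FRAME)
        return -1;
    if (hdr->frag_len == 0 || hdr->frag_len > FRAG_SIZE)
        return -1;

    uint32_t bit = 1u << hdr->frag_idx;
    if (slot->received_mask & bit)
        return -1;  /* duplicate fragment */

    memcpy(slot->data + (size_t)hdr->frag_idx * FRAG_SIZE,
           payload, hdr->frag_len);
    slot->received_mask |= bit;

    return slot->received_mask == (1u << FRAGS_PER_FRAME) - 1u;
}
```

The copy into `data` is exactly the step I would like the NIC to do for me, ideally by placing each payload at the right offset directly.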
Would any existing receive offload hardware be able to help here? Instead of 'just' reassembling on IP+TCP, could I tell it to reassemble on IP+UDP plus my application-level fragment/segment protocol?
Apart from using a custom FPGA, I mean.
Edit: I'm working with a Mellanox ConnectX-6 Dx NIC, DPDK 21.11, firmware 22.32.1010.