Requirement: I have a cluster of 4 machines and I want to measure the collective latency and bandwidth of RDMA Write and RDMA Read operations in a full mesh.
I used ib_write_lat/ib_read_lat for latency and ib_write_bw/ib_read_bw for bandwidth, but those are point-to-point (p2p) tests between one client and one server, so they do not measure the cluster as a whole.
I want to use a tool that is already available rather than a home-grown one, because publishing results obtained with home-grown tools will be a problem.
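
For context, the home-grown workaround I would like to avoid looks roughly like the sketch below. It only enumerates every ordered (server, client) pair in the 4-node mesh and prints the point-to-point perftest commands for each pair. The hostnames (`node1` to `node4`) are placeholders, and launching over `ssh` is just an assumption about how the nodes are reached. Running the pairs one after another like this gives per-pair numbers, not the collective behaviour under concurrent traffic.

```python
from itertools import permutations

# Hypothetical hostnames; replace with the real nodes in the cluster.
HOSTS = ["node1", "node2", "node3", "node4"]

# The perftest tools mentioned above.
TOOLS = ["ib_write_lat", "ib_read_lat", "ib_write_bw", "ib_read_bw"]


def mesh_pairs(hosts):
    """Return every ordered (server, client) pair in a full mesh."""
    return list(permutations(hosts, 2))


def pair_commands(tool, server, client):
    """Return (server_cmd, client_cmd) for one point-to-point run.

    The server side is started without a host argument; the client side
    is given the server's hostname. Using ssh here is an assumption.
    """
    return (f"ssh {server} {tool}", f"ssh {client} {tool} {server}")


if __name__ == "__main__":
    for tool in TOOLS:
        for server, client in mesh_pairs(HOSTS):
            server_cmd, client_cmd = pair_commands(tool, server, client)
            # Start the server first, then the client; one pair at a time.
            print(server_cmd)
            print(client_cmd)
```

With 4 nodes this produces 12 ordered pairs per tool, which is why I am hoping an existing tool already covers the full-mesh case.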