
I am trying to get RoCE (RDMA over Converged Ethernet) to work on two workstations. I have installed MLNX_OFED on both computers, which are equipped with Mellanox ConnectX-5 EN 100GbE adapters and connected directly to each other with the corresponding cable. According to what I've read, I need the Subnet Manager to be running on one of the workstations to be able to use RoCE between them.

When trying to run the command opensm, it says it finds no local ports. The computers can ping each other, and that works. I can also use the command udaddy to test RDMA, and it also works. But trying to run the RDMA_RC_EXAMPLE presented in this guide https://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf fails when trying to create the Queue Pairs, more specifically when it tries to change state to RTR (ready to receive).

Also, some sources say that you need an RDMA service, which does not exist on my computer. And the installation of MLNX_OFED puts an exclude for ibutils-libs* in /etc/yum.conf; I don't know if that's relevant, but I noticed it.

I am running CentOS 7.7 on one of the machines and CentOS 7.8 on the other.

I'm a bit puzzled over what's faulty.

Update: Here's the function that breaks when running the code.

/******************************************************************************
 * Function: modify_qp_to_rtr
 *
 * Input
 * qp QP to transition
 * remote_qpn remote QP number
 * dlid destination LID
 * dgid destination GID (mandatory for RoCEE)
 *
 * Output
 * none
 *
 * Returns
 * 0 on success, ibv_modify_qp failure code on failure
 *
 * Description
 * Transition a QP from the INIT to RTR state, using the specified QP number
 ******************************************************************************/
static int modify_qp_to_rtr (struct ibv_qp *qp, uint32_t remote_qpn, uint16_t dlid, uint8_t * dgid)
{
    struct ibv_qp_attr attr;
    int flags;
    int rc;
    memset (&attr, 0, sizeof (attr));
    attr.qp_state = IBV_QPS_RTR;
    attr.path_mtu = IBV_MTU_256;
    attr.dest_qp_num = remote_qpn;
    attr.rq_psn = 0;
    attr.max_dest_rd_atomic = 1;
    attr.min_rnr_timer = 0x12;
    attr.ah_attr.is_global = 0;
    attr.ah_attr.dlid = dlid;
    attr.ah_attr.sl = 0;
    attr.ah_attr.src_path_bits = 0;
    attr.ah_attr.port_num = config.ib_port;
    if (config.gid_idx >= 0)
    {
        attr.ah_attr.is_global = 1;
        attr.ah_attr.port_num = 1;
        memcpy (&attr.ah_attr.grh.dgid, dgid, 16);
        attr.ah_attr.grh.flow_label = 0;
        attr.ah_attr.grh.hop_limit = 1;
        attr.ah_attr.grh.sgid_index = config.gid_idx;
        attr.ah_attr.grh.traffic_class = 0;
    }
    flags = IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN |
        IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER;
    rc = ibv_modify_qp (qp, &attr, flags);
    if (rc)
        fprintf (stderr, "failed to modify QP state to RTR\n");
    return rc;
}

This source, RoCE Debug Flow for Linux, says that:

RoCE requires an MTU of at least 1024 bytes for net payload.

Which I guess would affect this line in the code:

attr.path_mtu = IBV_MTU_256;

When changing it to IBV_MTU_1024 it compiles, but gives the same error.
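
As an aside, one way to avoid guessing the MTU constant is to query the port for its currently active MTU and feed that into the QP attributes. This is a small sketch of my own, not from the Mellanox example; ctx and port would come from the surrounding program, and on RoCE the reported value tracks the Ethernet interface MTU (e.g. a 1500-byte netdev maps to IBV_MTU_1024):

/* Sketch: query the port's currently active MTU instead of hardcoding it.
 * On failure, fall back to IBV_MTU_1024, the minimum the debug guide
 * quoted above says RoCE needs. */
#include <infiniband/verbs.h>

static enum ibv_mtu active_path_mtu (struct ibv_context *ctx, uint8_t port)
{
    struct ibv_port_attr pattr;
    if (ibv_query_port (ctx, port, &pattr))
        return IBV_MTU_1024;
    return pattr.active_mtu;
}

The line above would then become attr.path_mtu = active_path_mtu (ctx, config.ib_port); note that both ends have to agree on a value no larger than what either port supports.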

  • hi and welcome to stack overflow, please note that stack overflow is for coding questions. I would suggest that you ask that very same question over at [super user](https://superuser.com/) – Ente Jul 29 '20 at 10:53
  • Thank you. I am sorry if I'm in the wrong forum. The question I posted could be code-related, since I am not sure if the problem lies within the setup of the system or within the code itself in the RDMA_RC_Example.c code. The latest I've discovered is that the code example given by Mellanox is an example for InfiniBand instead of Ethernet, and RoCE seems to have some restrictions compared to InfiniBand, which MAY cause the code to break. – Fiskrens Jul 29 '20 at 11:07
  • Could you report the value of `config.gid_idx`? In RoCE it is mandatory to use a GID. – haggai_e Jul 30 '20 at 06:40
  • Thank you! Yes, I found that out after much searching yesterday. That solved the problem and it's now running fine. If I understood correctly you don't need to specify the GID when using Infiniband and just only for RoCE, I had a bit trouble finding that information. Thank you for your help. @haggai_e – Fiskrens Jul 30 '20 at 07:43

1 Answer

Solved

When using RoCE you need to specify the GID index, as explained by @haggai_e. I don't think it's needed when using InfiniBand instead of RoCE.
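
For anyone hitting the same wall: a quick way to see which GID table entries are actually populated is to query them through libibverbs and pick one of the printed indexes as the example's GID index (what ends up in config.gid_idx). This is a minimal sketch of my own, not from the Mellanox manual, assuming the first device and port 1 as in the example:

/* gid_list.c -- print the populated GID table entries of port 1 on the
 * first RDMA device, so you know which index to use as the GID index.
 * Build: gcc gid_list.c -o gid_list -libverbs */
#include <stdio.h>
#include <string.h>
#include <infiniband/verbs.h>

int main (void)
{
    struct ibv_device **devs;
    struct ibv_context *ctx;
    struct ibv_port_attr pattr;
    union ibv_gid gid, zero;
    int num, i, b;

    devs = ibv_get_device_list (&num);
    if (!devs || num == 0)
    {
        fprintf (stderr, "no RDMA devices found\n");
        return 1;
    }
    ctx = ibv_open_device (devs[0]);
    if (!ctx || ibv_query_port (ctx, 1, &pattr))
        return 1;

    memset (&zero, 0, sizeof (zero));
    for (i = 0; i < pattr.gid_tbl_len; i++)
    {
        if (ibv_query_gid (ctx, 1, i, &gid))
            continue;
        if (!memcmp (&gid, &zero, sizeof (gid)))
            continue;           /* skip empty table entries */
        printf ("gid index %d: ", i);
        for (b = 0; b < 16; b += 2)
            printf ("%02x%02x%s", gid.raw[b], gid.raw[b + 1], (b < 14) ? ":" : "\n");
    }
    ibv_close_device (ctx);
    ibv_free_device_list (devs);
    return 0;
}

MLNX_OFED also ships a show_gids script that prints much the same table, together with the RoCE version and netdev of each entry.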

When running the qperf RDMA test, we also needed to set the connection manager flag for it to run. I would assume it fills the same role as the Subnet Manager that is needed when using InfiniBand instead of RoCE.
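
For reference, the invocation looked roughly like this (treat the exact spelling as my recollection; qperf's man page documents the flag as -cm / --use_cm, and the server address is a placeholder):

# on the server side: no arguments, qperf just listens for the client
qperf

# on the client side: -cm1 makes qperf set up the connection through the
# RDMA connection manager, which is what got the RC tests running for us
qperf -cm1 <server-ip> rc_bw rc_lat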

  • InfiniBand will work in many cases without a GID, as long as the two nodes communicating are in the same subnet. Using CM can help choose the right RoCE GID when several GIDs are configured. – haggai_e Jul 31 '20 at 08:17