I'm learning how to use RDMA over InfiniBand, and one problem I'm having is using a connection from more than one thread. I can't figure out how to create another completion queue, so the work completions get mixed up between the threads and it craps out. How do I create a completion queue for each thread that uses the connection?

Take this vomit for example:

void worker(struct ibv_cq* cq){
    while(conn->peer_mr.empty()) Sleep(1);
    struct ibv_wc wc{};
    struct ibv_send_wr wr{};
    memset(&wr, 0, sizeof wr);
    struct ibv_sge sge{};
    sge.addr = reinterpret_cast<unsigned long long>(conn->rdma_memory_region);
    sge.length = RDMA_BUFFER_SIZE;
    sge.lkey = conn->rdma_mr->lkey;
    wr.wr_id = reinterpret_cast<unsigned long long>(conn);
    wr.opcode = IBV_WR_RDMA_READ;
    wr.sg_list = &sge;
    wr.num_sge = 1;
    wr.send_flags = IBV_SEND_SIGNALED;
    struct ibv_send_wr* bad_wr = nullptr;
    while(true){
        if(queue >= maxqueue) continue;
        for(auto i = 0ULL; i < conn->peer_mr.size(); ++i){
            wr.wr.rdma.remote_addr = reinterpret_cast<unsigned long long>(conn->peer_mr[i]->mr.addr) + conn->peer_mr[i]->offset;
            wr.wr.rdma.rkey = conn->peer_mr[i]->mr.rkey;
            const auto err = ibv_post_send(conn->qp, &wr, &bad_wr);
            if(err){
                std::cout << "ibv_post_send " << err << "\n" << "Errno: " << std::strerror(errno) << "\n";
                exit(err);
            }
            ++queue;
            conn->peer_mr[i]->offset += RDMA_BUFFER_SIZE;
            if(conn->peer_mr[i]->offset >= conn->peer_mr[i]->mr.length) conn->peer_mr[i]->offset = 0;
        }
        int ne;
        do{
            ne = ibv_poll_cq(cq, 1, &wc);
        } while(!ne);
        --queue;
        ++number;
    }
}

If I ran more than one of these workers, they would all receive each other's work completions. I want each thread to receive only its own completions, not those of the other threads.

1 Answer

The completion queues are created somewhere outside of this code (you are passing in an ibv_cq *). If you'd like to figure out how to create multiple ones, that's the area to focus on.

However, the "crapping out" is not (just) happening because completions are mixed up between threads: the ibv_poll_cq and ibv_post_send functions are themselves thread-safe. Instead, the likely problem is that your own code isn't thread-safe: shared data structures such as conn->peer_mr are read and modified by every thread without any locking. You would have the same issues even without RDMA.
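On the completion side, if several threads do keep polling one shared CQ, each completion at least needs to say which thread it belongs to. The question's code sets wr_id to the same conn pointer for every request, so that information is lost. Below is a minimal sketch of one way to tag requests, assuming the 64-bit wr_id (which verbs hands back unchanged in ibv_wc::wr_id) is split into a worker index and a sequence number. The bit layout, kMaxWorkers, in_flight, and all function names are hypothetical and not part of any library.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical layout: top 8 bits = worker index, low 56 bits = sequence number.
constexpr unsigned kSeqBits = 56;
constexpr std::uint64_t kSeqMask = (std::uint64_t{1} << kSeqBits) - 1;
constexpr unsigned kMaxWorkers = 4;  // assumption for this sketch

// One in-flight counter per worker instead of a single shared `queue` variable.
std::array<std::atomic<int>, kMaxWorkers> in_flight{};

// Build the value to store in ibv_send_wr::wr_id before posting.
inline std::uint64_t make_wr_id(unsigned worker, std::uint64_t seq) {
    assert(worker < kMaxWorkers);
    return (static_cast<std::uint64_t>(worker) << kSeqBits) | (seq & kSeqMask);
}

// Recover the owner and the sequence number from ibv_wc::wr_id.
inline unsigned worker_of(std::uint64_t wr_id) {
    return static_cast<unsigned>(wr_id >> kSeqBits);
}
inline std::uint64_t seq_of(std::uint64_t wr_id) { return wr_id & kSeqMask; }

// Whichever thread polls a completion credits it to the worker that posted it.
inline void on_completion(std::uint64_t wr_id) {
    in_flight[worker_of(wr_id)].fetch_sub(1, std::memory_order_release);
}
```

With this, a worker would increment in_flight[its index] when it posts and wait on that counter, so it no longer matters which thread happened to pull a given completion off the CQ. It does not remove the need to fix the shared peer_mr state.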

The first step is to figure out how to split up the work into pieces. Think about the pieces that each thread will need to make it independent from the others. It'll likely be a single peer_mr, a separate ibv_cq *, and a specific chunk of your rdma_mr. Then code that :)
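The split described above could be sketched roughly as follows. This is plain C++ with the RDMA calls left out, so it only illustrates the ownership layout: WorkerSlice, make_slices, advance, and run_workers are hypothetical names, and the even division of the buffer is an assumption. One verbs detail worth keeping in mind: a QP is bound to its send and receive CQs through ibv_qp_init_attr when it is created, so giving each thread its own CQ generally means giving each thread its own QP as well.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Everything one worker touches. Nothing in here is shared with other threads.
struct WorkerSlice {
    std::size_t peer_index;     // which peer_mr entry this worker owns
    std::size_t local_offset;   // start of this worker's chunk of the local buffer
    std::size_t local_length;   // size of that chunk
    std::size_t remote_offset;  // private cursor into the peer's region
    std::size_t posted;         // requests this worker has issued
};

// Assumption: the local buffer divides evenly, one chunk and one peer per worker.
std::vector<WorkerSlice> make_slices(std::size_t workers, std::size_t buffer_size) {
    assert(workers > 0 && buffer_size % workers == 0);
    const std::size_t chunk = buffer_size / workers;
    std::vector<WorkerSlice> slices;
    for (std::size_t i = 0; i < workers; ++i)
        slices.push_back(WorkerSlice{i, i * chunk, chunk, 0, 0});
    return slices;
}

// Same wrap-around logic as the question's offset update, but on private state.
void advance(WorkerSlice& s, std::size_t remote_length) {
    s.remote_offset += s.local_length;
    if (s.remote_offset >= remote_length) s.remote_offset = 0;
    ++s.posted;
}

void run_workers(std::vector<WorkerSlice>& slices, std::size_t remote_length,
                 std::size_t iterations) {
    std::vector<std::thread> threads;
    for (auto& s : slices)
        threads.emplace_back([&s, remote_length, iterations] {
            for (std::size_t n = 0; n < iterations; ++n) {
                // Real code would post on this worker's own QP and poll
                // this worker's own CQ here before advancing.
                advance(s, remote_length);
            }
        });
    for (auto& t : threads) t.join();
}
```

Because each thread only ever writes to its own WorkerSlice, no locks are needed around the offset updates.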

Greg Inozemtsev
  • Thank you so much just for replying. I realise it's a mess, but the data itself in this test is irrelevant, and it does the same thing with writes: one thread receives more completions than the others, and the others get stuck. I have tried creating a completion queue per thread, but they don't do anything, if I remember correctly. I haven't done anything with RDMA for a while because I gave up on it after trying every possible solution. – CommanderLake Sep 21 '21 at 22:12
  • I'm not sure how you can have multiple threads running operations like `conn->peer_mr[i]->offset += RDMA_BUFFER_SIZE;` and expect any sane answer to come out. – Greg Inozemtsev Sep 23 '21 at 21:03