Questions tagged [rdma]

RDMA refers to "Remote Direct Memory Access," which is a set of networking technologies typically used for high performance, low latency communication.

RDMA networks have three main attributes:

  1. Asynchronous queueing: network operations are submitted by adding requests to "work queues," which are executed asynchronously by hardware; when an operation completes, the result is returned to software in a "completion queue."
  2. Kernel bypass: userspace processes can submit operations directly to the network adapter hardware without any system calls.
  3. One-sided operations: one system can read from or write into the memory of a remote system without involving any software on the remote system.
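
The queueing model in the list above can be sketched with a toy, in-process simulation. This is not the verbs API and touches no real hardware: `ToyAdapter`, `WorkRequest`, `Completion`, and the `process()` step that stands in for the network adapter are all hypothetical names made up for illustration. Kernel bypass cannot be shown in pure Python, so the sketch covers only asynchronous queueing and one-sided reads and writes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkRequest:
    wr_id: int            # caller-chosen tag, echoed back in the completion
    opcode: str           # "WRITE" or "READ" (the one-sided operations)
    remote_offset: int    # where in the remote buffer to operate
    data: bytes = b""     # payload for WRITE
    length: int = 0       # number of bytes for READ


@dataclass
class Completion:
    wr_id: int
    status: str           # "OK" or "ERROR"
    data: bytes = b""     # bytes returned by a READ


class ToyAdapter:
    """Hypothetical stand-in for an RDMA adapter; purely illustrative."""

    def __init__(self, remote_memory: bytearray):
        self.remote_memory = remote_memory   # models the remote system's memory
        self.work_queue = deque()
        self.completion_queue = deque()

    def post(self, wr: WorkRequest) -> None:
        # Submitting only enqueues the request; nothing is executed yet.
        self.work_queue.append(wr)

    def process(self) -> None:
        # Models the hardware draining the work queue. No code "on the
        # remote side" runs: the adapter touches remote memory directly.
        while self.work_queue:
            wr = self.work_queue.popleft()
            size = len(wr.data) if wr.opcode == "WRITE" else wr.length
            end = wr.remote_offset + size
            in_bounds = 0 <= wr.remote_offset and end <= len(self.remote_memory)
            if wr.opcode not in ("WRITE", "READ") or not in_bounds:
                done = Completion(wr.wr_id, "ERROR")
            elif wr.opcode == "WRITE":
                self.remote_memory[wr.remote_offset:end] = wr.data
                done = Completion(wr.wr_id, "OK")
            else:
                chunk = bytes(self.remote_memory[wr.remote_offset:end])
                done = Completion(wr.wr_id, "OK", chunk)
            self.completion_queue.append(done)

    def poll(self) -> list:
        # Software learns about finished work only by polling the queue.
        done = list(self.completion_queue)
        self.completion_queue.clear()
        return done


if __name__ == "__main__":
    remote = bytearray(8)
    nic = ToyAdapter(remote)
    nic.post(WorkRequest(wr_id=1, opcode="WRITE", remote_offset=2, data=b"hi"))
    nic.post(WorkRequest(wr_id=2, opcode="READ", remote_offset=2, length=2))
    print(nic.poll())    # nothing has completed yet
    nic.process()        # the "hardware" runs independently of post()
    print(nic.poll())    # one completion per posted request
```

In this sketch `post()` returns immediately and results surface only through `poll()`, which is the property the first attribute describes. In a real verbs program the analogous steps would be posting to a queue pair and polling a completion queue, with the adapter doing the work that `process()` fakes here.
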
177 questions
1
vote
3 answers

Remote Direct Memory Access and OS

I want to know the role of the OS when initiating RDMA. Who initiates it, the OS or the CPU? What happens to the OS after the RDMA starts?
gpuguy
  • 4,607
  • 17
  • 67
  • 125
1
vote
1 answer

find maximum allowed ibv_reg_mr

I'm trying to diagnose a memory allocation error thrown by ibv_reg_mr() in software that I use, and my suspicion is that it's related to known problems with some Mellanox Infiniband cards where the default maximum memory that can be registered is…
jason_r
  • 33
  • 6
1
vote
1 answer

rdmacm.so: Cannot open shared object file. However, file exists in library path

I have a program that uses the InfiniBand rdmacm library, rdmacm.so. On one computer (Ubuntu server), I can run it without issues. On my dev PC (Ubuntu desktop edition), I get: ./test-client rdmacm.so: cannot open shared object file: No such file…
hookenz
  • 36,432
  • 45
  • 177
  • 286
1
vote
1 answer

Inter process communication using GA library

How is the Global Arrays (GA) library (an implementation of ARMCI) used for communication between two processes located on different remote machines? Is it similar to TCP socket programming, where one process waits for data and the other…
LeTex
  • 1,452
  • 1
  • 14
  • 28
0
votes
2 answers

Programmatically retrieve infiniband device ip address

I'm trying to programmatically find the inet address of an InfiniBand interface whose name is not known a priori. I'm on Linux, and I would like to avoid parsing ifconfig(8) output. I've read the second comment on this answer, that suggests…
Luca Martini
  • 1,434
  • 1
  • 15
  • 35
0
votes
1 answer

User-mode application that performs RDMA directly to nvme drive on Linux

We've been developing HPC applications that take advantage of the InfiniBand infrastructure. One of our applications exchanges data that is stored in an NVMe cache with other nodes, and for that it uses RDMA posts. We think we can increase the…
Caian
  • 440
  • 4
  • 15
0
votes
0 answers

Issues with using KafkaDirect for Kafka RDMA communication

I'm attempting to install KafkaDirect from the GitHub repository to enable RDMA communication in Kafka. My environment is as follows: Ubuntu 20.04; cluster: Node1, Node2, Node3; Mellanox ConnectX-3 InfiniBand. KafkaDirect is an adaptation…
0
votes
0 answers

How can I perform continuous write operations with RDMA WRITE?

I am a novice learning RDMA. I want to create a client and a server, and use RDMA WRITE to continuously write data to the server. How can I implement this? I found some information, but the client is disconnected after one write operation, and…
hhyyd
  • 1
0
votes
0 answers

Modifying a QP to the RTR state fails in a RoCE (RDMA over Converged Ethernet) environment

When testing the connection between two RDMA devices on one computer, I get an error when modifying the QP to RTR: server: client: The info of RDMA device: The connection test between mlx5_3 with gidIndex(5) and mlx5_1 with gidIndex(5) failed as…
0
votes
0 answers

How to simulate latency for RDMA?

I'm investigating the performance of RDMA under different latency settings. To be more specific, I have built a testbed with 3 servers, connected in a line (A-B-C) via fiber. For TCP, I can use the Linux tc tool on node B to add latency…
0
votes
1 answer

Why is the RDMA SEND/RECV operation preferred over RDMA WRITE for control signal communication?

Many papers mention that RDMA SEND/RECV is often used for command/signal or small-size data communication, but I can't find a reasonable explanation for that. I know that the memory semantics (WRITE/READ) can reduce latency and CPU…
0
votes
1 answer

How can I take a Vec<> out of my async function in Rust?

I have been trying to write a function that takes in a list of IPs (via a clientconfig struct) and creates clients for all of the addresses. It should then return these so that I can use them. But I keep getting lifetime errors. I have been trying to…
0
votes
0 answers

RDMA Chelsio T6225-CR Register Memory Problem

I am an engineer working on the RDMA iWARP protocol. I use a Chelsio T6225-CR NIC adapter and connect two PCs back to back. The development environment is Windows 10, and I use NDSPI (NDSPI Interfaces (Windows)). I want to register 1TB of memory to my adapter by…
holee
  • 1
  • 1
0
votes
0 answers

RDMA support for the GPUDirect feature

I want to use the GPUDirect RDMA feature with NVIDIA GPUs. Must I use a hardware RDMA NIC from Mellanox, such as the ConnectX-5, or can I use an RDMA NIC from another company and adapt it to support the GPUDirect feature? If so, what do I need to install? Now, I make some RDMA…
xiaobin
  • 1
  • 2
0
votes
0 answers

Is "header-data split" enabled in Linux networking? (over RDMA or DPDK)

I use a UD connection in a RoCE environment with a custom header. Therefore, each packet is split into two parts: header and data. So when I receive packets from another node, the data part is not sequential, even though the ring buffer for…