I'm designing an application involving multi-node communication using InfiniBand (ibv_*). What is the standard way to maintain connections between nodes? I'm thinking of O(N^2) connections for all pairs of nodes as the easiest approach, but it's kind of silly and not scalable.
- Will you need RDMA operations between the nodes, or do you plan only to use send and receive operations? – haggai_e Aug 25 '14 at 06:25
- Also, can you tell in advance what will be the communication pattern between the nodes? Perhaps you could dynamically create RC connections and avoid using O(N^2) connections. – haggai_e Aug 25 '14 at 06:27
- Yes I need RDMA operation. – w00d Aug 25 '14 at 16:33
1 Answer
The question is kinda simple and short, but the real answer is VERY long...
First of all, be sure that you really need to use ibv_... stuff.
Are you using InfiniBand or RoCE?
Next, analyze the expected communication pattern of your application.
You're talking about scalability, which probably means that you have a massively parallel application in mind. Do you really need to invent your own communication layer? Can't you use existing solutions? There's a whole CS field that deals with this kind of problem - HPC (High Performance Computing). Perhaps MPI/UPC/some other library will solve your problem?
If you still need to write your own ibv_... application with lots and lots of machines, then you need to consider:
- do you need RC or UD connections?
- if you have the newest Mellanox HCA (Connect-IB) then there's also the option of DC (Dynamically Connected) transport
- what are the scalability requirements?
- how sensitive is the application to latency/BW?
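To put rough numbers on the scalability question, here is a minimal sketch of the QP arithmetic. It assumes one process per node and one RC QP per peer; the function names are made up for illustration and are not part of libibverbs. Keep in mind that UD only supports send/receive, so it does not help if you need RDMA read/write.

```c
#include <stdint.h>

/* RC: every node needs one QP per remote peer to reach it directly. */
uint64_t rc_qps_per_node(uint64_t nodes)
{
    return nodes ? nodes - 1 : 0;
}

/* RC full mesh: total QP endpoints across the whole cluster, N * (N - 1). */
uint64_t rc_qps_total(uint64_t nodes)
{
    return nodes * rc_qps_per_node(nodes);
}

/* UD: a single QP per node can address every peer (send/receive only). */
uint64_t ud_qps_total(uint64_t nodes)
{
    return nodes;
}
```

For 1024 nodes this gives 1023 RC QPs per node and 1,047,552 across the cluster, versus 1024 UD QPs in total.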
To summarize:
- if you need to have a massively parallel IB verbs application, and you need RC, you'd better open RC connections on-demand
- if you have to have all the RC connections opened in advance, then there's no other way - the O(n^2) connections case is inevitable
- if it fits your needs, consider using UD
- check that existing solutions are not what you need
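The "open RC connections on-demand" idea can be sketched roughly as a per-node table that is filled in lazily. This is only an illustration under assumptions: `struct peer_conn`, `struct conn_table` and `establish_connection()` are hypothetical names, and the stub stands in for the real work (creating the QP, exchanging QP number and address info with the peer out of band, and moving the QP through INIT/RTR/RTS). The remote side has to take part in that handshake too, which the sketch does not show.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one connected RC QP (a struct ibv_qp * in real code). */
struct peer_conn {
    bool connected;
    int  handle;    /* placeholder for the real QP handle */
};

/* Per-node table: one slot per peer, filled in only when first used. */
struct conn_table {
    struct peer_conn *peers;
    size_t n_peers;
    size_t n_open;  /* number of connections actually established */
};

/* Placeholder only: real code would create the QP and do the handshake here. */
static int establish_connection(size_t peer)
{
    return (int)peer + 1000;
}

int conn_table_init(struct conn_table *t, size_t n_peers)
{
    if (n_peers == 0)
        return -1;
    t->peers = calloc(n_peers, sizeof *t->peers);
    if (!t->peers)
        return -1;
    t->n_peers = n_peers;
    t->n_open = 0;
    return 0;
}

/* Return the connection to `peer`, establishing it on first use. */
struct peer_conn *conn_table_get(struct conn_table *t, size_t peer)
{
    if (peer >= t->n_peers)
        return NULL;
    struct peer_conn *c = &t->peers[peer];
    if (!c->connected) {
        c->handle = establish_connection(peer);
        c->connected = true;
        t->n_open++;
    }
    return c;
}

void conn_table_destroy(struct conn_table *t)
{
    free(t->peers);
    t->peers = NULL;
    t->n_peers = 0;
    t->n_open = 0;
}
```

The table itself still costs O(N) memory per node, but QPs are only created for the peers a node actually talks to, so a sparse communication pattern never gets close to the full mesh.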

kliteyn