My InfiniBand Learning Journey

I’m excited to finally get hands-on with InfiniBand and RDMA technologies. After years of reading about how these protocols power supercomputers and AI training clusters, I’m ready to experience them firsthand.

The Setup

I’ve ordered two Mellanox MHGH29-XTC ConnectX VPI dual-port adapters along with SFF-8470 (CX4) cables to connect them directly, no switch required. These VPI (Virtual Protocol Interconnect) cards can run each port either as native InfiniBand at 20Gb/s DDR or as 10Gb Ethernet, which gives me the flexibility to experiment with both technologies on the same hardware.
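
Before any of that, step one is confirming that Linux sees the cards and learning the VPI mode switch. Here’s a minimal sketch of what I expect that to look like with the in-kernel mlx4 driver; the PCI address is a placeholder for whatever slot the cards actually land in:

    # Confirm the adapters enumerate on the PCIe bus
    lspci | grep -i mellanox

    # Load the ConnectX driver stack and check port/link state
    modprobe mlx4_ib
    ibstat

    # On mlx4 hardware the VPI personality is a per-port sysfs knob;
    # 0000:01:00.0 is an assumed address, not a fixed path
    cat /sys/bus/pci/devices/0000:01:00.0/mlx4_port1    # prints "ib" or "eth"
    echo eth > /sys/bus/pci/devices/0000:01:00.0/mlx4_port1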

Why InfiniBand?

There’s only so much you can learn from documentation and benchmarks. I want to understand what makes RDMA (Remote Direct Memory Access) special by actually seeing the performance difference. I want to configure subnet managers, troubleshoot IPoIB networking, and see what happens when data transfers bypass the kernel and the remote CPU entirely.

What I’m Planning to Learn

My two-node Proxmox setup will let me explore:

  • InfiniBand fundamentals – OpenSM subnet managers, IPoIB configuration, and understanding the full stack (a bring-up sketch follows this list)
  • RoCE experimentation – Switching the VPI ports between InfiniBand and Ethernet mode (the sysfs knob shown earlier) and trying RDMA over Converged Ethernet
  • Storage clustering – Testing Ceph performance over the high-speed fabric (config excerpt below)
  • Live VM migration – Experiencing the difference 20Gb/s makes for virtual machine mobility (see the datacenter.cfg snippet below)
  • RDMA applications – Eventually diving into applications that use RDMA verbs directly, starting with the perftest benchmarks below
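
First, the fundamentals. A point-to-point InfiniBand link stays down until a subnet manager sweeps it, so one node has to run OpenSM before anything else works; after that, IPoIB gets addressed like any other interface. A rough bring-up sketch, assuming Debian-style packages and that the interface appears as ib0:

    # One node must run a subnet manager for ports to go ACTIVE
    apt install opensm infiniband-diags
    systemctl enable --now opensm

    # Port state should move from INIT to ACTIVE after the SM sweep
    ibstat

    # IPoIB: load the module, then treat ib0 like a normal NIC
    modprobe ib_ipoib
    ip addr add 10.10.10.1/24 dev ib0    # .2 on the second node
    ip link set ib0 up

    # Connected mode lifts the IPoIB MTU from 2044 to 65520 bytes;
    # this is one of the MTU experiments I want to run
    echo connected > /sys/class/net/ib0/mode
    ip link set ib0 mtu 65520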
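
For the Ceph experiment, the plan is simply to point both of Ceph’s networks at the IPoIB subnet brought up above. A hypothetical excerpt, using my assumed 10.10.10.0/24 addressing:

    # /etc/pve/ceph.conf (excerpt): route Ceph traffic over the fabric
    [global]
        public_network  = 10.10.10.0/24
        cluster_network = 10.10.10.0/24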
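
Live migration should follow from a single datacenter-wide setting that steers Proxmox’s migration traffic onto the same subnet. As I understand the datacenter.cfg syntax, it looks like this (the VM ID and node name below are placeholders):

    # /etc/pve/datacenter.cfg: send live-migration traffic over IPoIB
    migration: secure,network=10.10.10.0/24

    # then move a running VM and watch the transfer rate
    qm migrate 100 node2 --online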
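
And for RDMA itself, the perftest micro-benchmarks are the standard first stop before writing any verbs code. A sketch of a first bandwidth test between the two nodes; the address is the assumed IPoIB one, used only for the out-of-band handshake, while the measured traffic is pure RDMA:

    apt install perftest

    # node 1: listen
    ib_write_bw

    # node 2: connect and measure RDMA write bandwidth
    ib_write_bw 10.10.10.1

    # ib_read_lat and ib_send_lat work the same server/client way
    # and report the latency numbers, which interest me even more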

Why This Excites Me

For under $100 in used hardware, I’m building a miniature version of what runs in enterprise data centers and HPC installations. The principles are the same, the software stack is identical, and the learning is directly applicable to real-world scenarios. Whether it’s understanding Azure’s RDMA storage, AWS’s EFA, or NVIDIA’s AI training networks, these concepts are everywhere in modern high-performance computing.

There’s something deeply satisfying about working with enterprise-grade hardware that’s now accessible on the used market. These ConnectX cards powered serious infrastructure, and now they’re perfect learning tools.

What’s Next

Once the cards arrive and everything’s connected, I’ll be documenting my journey—the wins, the challenges, and everything I discover. I’m particularly curious about comparing InfiniBand versus Ethernet mode performance, experimenting with MTU settings, and seeing how live VM migration performs at these speeds.

If you’ve been curious about InfiniBand but thought it was out of reach, I hope this shows that with a modest investment, you can build a legitimate learning platform for technologies that run the world’s fastest systems.


Have you experimented with InfiniBand or high-speed networking? I’d love to hear your experiences and any tips for getting started!