800G Modules New Arrival!

What Is Remote Direct Memory Access (RDMA)?

With the continuous expansion of high-performance data centers, Remote Direct Memory Access (RDMA) technology has emerged, mainly for HPC applications. 

 

RDMA technology provides a more efficient way to exchange data by accessing remote memory directly, bypassing the operating system kernel and keeping CPU involvement to a minimum. This article introduces the basic concepts, working principles, technical characteristics, and wide application of RDMA technology in modern computing.

 

What is RDMA?

RDMA (Remote Direct Memory Access) technology can transfer data from one server to another, or from storage to server, with minimal CPU usage.

 

To send data, a traditional application hands it to the OS, which encapsulates it in TCP/IP. The data then passes through buffers in main memory and in the NIC before it is sent out.

 

This results in two limitations:

Limitation 1: TCP/IP stack processing adds latency on the order of 10 microseconds. When the TCP stack receives and sends packets, the kernel has to perform multiple context switches, and each switch takes 5-10 microseconds. In addition, the data is copied at least three times, and the CPU does the protocol work. This results in a fixed latency on the order of 10 microseconds for protocol processing alone, which makes protocol stack latency the most obvious bottleneck.

Limitation 2: TCP stack processing keeps the server's CPU load high. The larger the network and the higher its bandwidth, the greater the scheduling burden on the CPU when sending and receiving data, which results in a continuously high CPU load.
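The latency figure in Limitation 1 can be checked with back-of-the-envelope arithmetic. The sketch below is only an illustration: the function name and the number of context switches per send are assumptions, while the cost of 5-10 microseconds per switch is the figure quoted above.

```python
def switch_latency_us(switches, per_switch_us):
    """Latency added by kernel context switches alone, in microseconds."""
    return switches * per_switch_us


# Assumption: one switch into the kernel and one back out per send.
ASSUMED_SWITCHES = 2

best_case = switch_latency_us(ASSUMED_SWITCHES, 5)    # 5 us per switch
worst_case = switch_latency_us(ASSUMED_SWITCHES, 10)  # 10 us per switch
print(f"context-switch latency: {best_case}-{worst_case} us")
```

Even under this optimistic assumption, context switching alone accounts for 10 microseconds or more, before any data copies or protocol work are counted.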

 

Inside the data center, hyperscale distributed computing and storage resources need to be interconnected. If traditional TCP/IP is used for this, it occupies a large share of the system's computing resources and causes I/O bottlenecks, so it cannot meet the network requirements of higher throughput and lower latency.

 

RDMA is a high-bandwidth, low-latency network interconnection technology with low CPU consumption, which overcomes many of the difficulties of traditional TCP/IP networks.

 

[Figure: What is RDMA]

 

Principles of RDMA

Background: RDMA was proposed to provide faster and lighter network communication and to remove the bottleneck that traditional network communication places on computing tasks.

 

Principle: RDMA provides low latency using stack bypass and zero-copy technology.

 

Effect: Reduces CPU usage, reduces memory bandwidth bottlenecks, and provides high bandwidth utilization. RDMA provides an IO-based channel that allows an application to read and write directly to remote virtual memory through an RDMA device.

 

[Figure: The principle of RDMA]

 

The design of RDMA technology mainly has the following principles/characteristics:

 

1 CPU Offload (CPU Bypass)

An application can access a remote host's memory without consuming CPU cycles on the remote host for the transfer itself (the remote side may still need to be notified when a transfer completes). Remote memory can be read without involving any process (or the CPU) on the remote host, and the remote CPU's cache is not filled with the memory contents being accessed.
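The difference between remote access with and without the remote CPU can be illustrated with a toy model. This is a local simulation only, not a real RDMA API: the classes `RemoteHost` and `ToyRdmaNic` and all of their methods are hypothetical names invented for this sketch.

```python
class RemoteHost:
    """Toy stand-in for a remote machine (hypothetical, not a real RDMA API)."""

    def __init__(self, size):
        self.memory = bytearray(size)  # memory exposed to the RDMA device
        self.cpu_handler_calls = 0     # how often the host's software had to run

    def handle_request(self, offset, length):
        # Conventional path: the host's CPU runs code to serve the request.
        self.cpu_handler_calls += 1
        return bytes(self.memory[offset:offset + length])


class ToyRdmaNic:
    """Toy NIC that reads and writes host memory without calling host code."""

    def __init__(self, host):
        self._host = host

    def read(self, offset, length):
        return bytes(self._host.memory[offset:offset + length])

    def write(self, offset, data):
        if offset + len(data) > len(self._host.memory):
            raise ValueError("write exceeds the exposed memory")
        self._host.memory[offset:offset + len(data)] = data


host = RemoteHost(16)
nic = ToyRdmaNic(host)

nic.write(0, b"hello")                 # direct write, no host code runs
direct = nic.read(0, 5)                # direct read, no host code runs
served = host.handle_request(0, 5)     # conventional request served by the host
print(direct, served, host.cpu_handler_calls)
```

Both paths return the same bytes, but only the conventional request increments the counter of host-side software invocations.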

 

2 Kernel Bypass

Kernel bypass means that an application can transfer data directly from user mode, without switching between kernel mode and user mode.

 

3 Zero Copy

The main task of zero copy is to prevent the CPU from copying data from one memory area to another. In TCP/IP communication, transmitting data between hosts requires frequent copy operations (user-space buffer, socket buffer, NIC buffer), which increase the transmission delay. RDMA, on the other hand, transfers data with zero copies.

 

One of the most important aspects of RDMA technology is that each application has direct access to the virtual memory of the devices in the cluster, which means that the application can perform data transfers directly. Without involving the network software stack, data can be sent from or received into the application's buffer directly, without being copied into the network layer.
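The difference between copying a buffer and sharing it can be shown with a local analogy. The sketch below uses only Python built-ins and does not involve a network or an RDMA device; the buffer contents are made up for illustration.

```python
buffer = bytearray(b"payload-0000")  # the application's buffer

copied = bytes(buffer)               # copy: a second buffer is allocated and filled
view = memoryview(buffer)            # zero copy: a window onto the same bytes

buffer[8:12] = b"1234"               # the application updates its buffer in place

print(copied)                        # the copy still holds the old contents
print(bytes(view))                   # the view reflects the update
```

The copy costs CPU time and memory bandwidth proportional to the buffer size, while the view costs neither. Zero copy in RDMA follows the same idea: the device works on the application's buffer itself rather than on intermediate copies.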

 

4 Asynchronous Interface

In RDMA, all of the interfaces provided are asynchronous communication interfaces, which makes it easier to separate computation from communication when programming.
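The asynchronous style can be sketched as "post a request, keep computing, collect the completion later". The toy model below is a single-process simulation under stated assumptions: the class `ToyChannel` and its methods `post_send`, `progress`, and `poll` are hypothetical names for this sketch, not a real RDMA API.

```python
from collections import deque


class ToyChannel:
    """Toy model of an asynchronous interface (hypothetical, not a real RDMA API)."""

    def __init__(self):
        self._pending = deque()      # requests posted but not yet finished
        self._completions = deque()  # finished requests waiting to be collected

    def post_send(self, request_id, data):
        # Returns immediately; the transfer itself happens later.
        self._pending.append((request_id, len(data)))

    def progress(self):
        # Stands in for the hardware finishing one transfer in the background.
        if self._pending:
            self._completions.append(self._pending.popleft())

    def poll(self):
        # Non-blocking: returns one completion, or None if nothing has finished.
        return self._completions.popleft() if self._completions else None


channel = ToyChannel()
channel.post_send(1, b"block-a")
channel.post_send(2, b"block-b")

early = channel.poll()                    # nothing has finished yet
work = sum(i * i for i in range(1000))    # computation continues in the meantime
channel.progress()
channel.progress()
done = [channel.poll(), channel.poll()]   # (request id, bytes sent) per completion
print(early, done)
```

Because posting never blocks, the application decides when to check for completions, which is what lets it overlap computation with communication.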

 

5 RDMA Communication Protocols

Currently, there are three communication technologies that support RDMA:

 

IB (InfiniBand)

RoCE (RDMA over Converged Ethernet)

iWARP (Internet Wide Area RDMA Protocol)

 

All three technologies can be used through the same set of APIs, but they differ in the layers underneath: InfiniBand has its own physical and link layers, while RoCE and iWARP both run over Ethernet.

 

[Figure: RDMA Application-ULP]

 

Applications of RDMA

RDMA technology has been widely used in many fields due to its superior performance, especially in scenarios that require large-scale data transmission and high-speed computing.

 

1 High Performance Computing (HPC):

In supercomputers and large-scale computing clusters, RDMA technology helps improve overall computing performance by accelerating data transfer between nodes. 

In computationally intensive tasks such as weather prediction and genomics simulation, RDMA has become a key technology for improving computational efficiency.

 

2 Data Center & Cloud Computing:

Distributed applications in modern data centers and cloud computing platforms require fast and efficient data exchange. RDMA improves the efficiency of data transfer between virtual machines, storage devices, and compute nodes, reduces latency, and improves overall performance.

 

3 Distributed Storage System:

RDMA is becoming more and more widely used in distributed storage systems. Distributed storage systems, such as Ceph, use RDMA to achieve efficient data transfer between storage devices, which significantly improves the performance of storage systems.

 

4 Database Acceleration:

RDMA technology is also widely used in the database field. By reducing the communication latency between database nodes, it can significantly accelerate the query and write operations of distributed databases. This improves the processing capacity of large-scale databases, especially transactional databases that require high-speed responses.

 

Conclusion

 

RDMA is a revolutionary network communication technology that is widely used across industries thanks to its low latency, high throughput, and zero-copy transmission. Whether in high-performance computing, distributed storage, cloud computing, or database acceleration, it has shown great potential.

 

MVSLINK is a reliable provider of optical network solutions to build a fully connected, intelligent world through innovative computing and networking solutions.
