InfiniBand: a matrix for data
As data traffic between the components of heavily loaded systems grows, the problem of its accurate, strictly point-to-point delivery becomes increasingly acute. The ideal solution would be a universal interconnect technology with high bandwidth and efficient interaction between both local and network devices, one that essentially combines them into a single fabric for the data center or network core. Funny but true: one such "matrix" (or rather, data switching technology) appeared almost simultaneously with the Wachowskis' film. We are talking about the InfiniBand standard.
InfiniBand technology dates back to 1999, when two competing projects (Future I/O and Next Generation I/O) were merged under the auspices of the largest communication equipment manufacturers of the time: Compaq, IBM, Hewlett-Packard, Intel, Microsoft and Sun. Architecturally, it is a switched fabric of high-speed links between computing modules and storage devices at the scale of a supercomputer or data center. The following priorities were built into the InfiniBand standard from the start:
- Hierarchical traffic prioritization;
- Low latency;
- Scalability;
- Redundancy (failover capability).
Well, and probably most important of all: the ability to pick a suitable speed from a range that goes from high to very high. The table below shows InfiniBand's simplex bandwidth in terms of usable traffic for different modes and lane counts.
| Lanes | SDR | DDR | QDR | FDR | EDR |
|-------|-----|-----|-----|-----|-----|
| 1X | 2 Gbit/s | 4 Gbit/s | 8 Gbit/s | 13.64 Gbit/s | 25 Gbit/s |
| 4X | 8 Gbit/s | 16 Gbit/s | 32 Gbit/s | 54.54 Gbit/s | 100 Gbit/s |
| 12X | 24 Gbit/s | 48 Gbit/s | 96 Gbit/s | 163.64 Gbit/s | 300 Gbit/s |
The InfiniBand bus is serial, just like, say, PCIe or SATA, but unlike the latter it can use both fiber-optic and copper transmission media, which allows it to serve both internal connections and external links over considerable distances. Transmitted data is encoded using the 8b/10b scheme for speeds up to and including QDR and the 64b/66b scheme for FDR and EDR. InfiniBand links are usually terminated with CX4 (in the photo on the left) and QSFP connectors; for high-speed links, optics are increasingly used.
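The usable figures in the table follow directly from the per-lane signaling rate and the encoding overhead. A quick sanity check, assuming the usual per-lane signaling rates (2.5 Gbit/s for SDR, 14.0625 Gbit/s for FDR):

```latex
% usable rate = lanes x signaling rate x encoding efficiency
\text{SDR, 1X:}\quad 1 \times 2.5 \times \tfrac{8}{10} = 2\ \text{Gbit/s}
\text{FDR, 4X:}\quad 4 \times 14.0625 \times \tfrac{64}{66} \approx 54.54\ \text{Gbit/s}
```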
InfiniBand is promoted and standardized by the InfiniBand Trade Association, a consortium of interested manufacturers that includes IBM, Hewlett-Packard, Intel, Oracle and other companies. As for the equipment itself, that is, InfiniBand adapters and switches, the leading positions in the market are held by Mellanox and QLogic (whose InfiniBand business was acquired by Intel in early 2012).
Let's take a closer look at the InfiniBand network architecture using a small SAN as an example.
InfiniBand adapters fall into two categories: Host Channel Adapters (HCA) and Target Channel Adapters (TCA). HCAs are installed in servers and workstations, TCAs in storage devices; accordingly, the former issue control commands and transfer data, while the latter execute commands and also transfer data. Each adapter has one or more ports. As already mentioned, one of InfiniBand's features is very precise traffic routing. For example, a data transfer from one storage device to another must be initiated by an HCA, but once the control directives have been handed over, the server steps out of the picture: all traffic flows directly from one storage device to the other.
The photo on the left shows a QLogic QLE7340 HCA adapter (QDR, 40 Gbit/s).
As you can see, the number of connections between InfiniBand nodes is deliberately redundant. This is done to increase transfer speed and provide fault tolerance. A set of end nodes connected to one or more switches is called a subnet; the subnet map, that is, the set of available routes between nodes, is stored in the memory of the subnet manager, of which there must be at least one. Multiple subnets can be joined into a larger network using InfiniBand routers.
InfiniBand was developed not only as a means of optimal data transfer, but also as a standard for the direct exchange of server memory contents; thus, the RDMA (Remote Direct Memory Access) protocol runs on top of it, allowing memory regions to be read and written remotely without involving the operating system. In turn, a number of more specialized protocols that extend its functionality are built on RDMA.
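To give a feel for how an application reaches RDMA, here is a minimal sketch using the verbs API (libibverbs). It only registers a buffer and prints the keys a remote peer would need; device selection, error handling and queue-pair setup are omitted, so treat it as an illustration rather than a complete program.

```c
/* Minimal RDMA sketch with libibverbs: register a buffer the HCA can
   access directly, bypassing the OS kernel on the data path. */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices;
    struct ibv_device **dev_list = ibv_get_device_list(&num_devices);
    if (!dev_list || num_devices == 0) {
        fprintf(stderr, "no InfiniBand devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(dev_list[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);            /* protection domain */

    size_t len = 4096;
    void *buf = malloc(len);
    /* Register the buffer so the adapter may read and write it directly. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

    printf("registered MR: lkey=0x%x rkey=0x%x\n", mr->lkey, mr->rkey);
    /* The rkey and buffer address would be sent to the peer, which could
       then post RDMA read/write work requests against this region
       without involving our operating system. */

    ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(dev_list);
    free(buf);
    return 0;
}
```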
The standard TCP/IP protocol stack can also be run on top of InfiniBand (IP over InfiniBand, IPoIB); support for it is, I believe, included in every InfiniBand software package, both the proprietary ones from the various network equipment vendors and the open ones.
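To applications, an IPoIB link looks like an ordinary network interface, commonly named ib0, so existing TCP/UDP code works over it unchanged. A small sketch, assuming such an interface is configured on the host (the name is an assumption for illustration):

```c
/* List IPoIB interfaces and their IPv4 addresses; any socket bound to
   one of these addresses will carry its traffic over InfiniBand. */
#include <stdio.h>
#include <string.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    struct ifaddrs *ifaddr, *ifa;
    if (getifaddrs(&ifaddr) == -1) {
        perror("getifaddrs");
        return 1;
    }
    for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) {
        /* Assumed naming convention: IPoIB interfaces start with "ib". */
        if (ifa->ifa_addr && ifa->ifa_addr->sa_family == AF_INET &&
            strncmp(ifa->ifa_name, "ib", 2) == 0) {
            char ip[INET_ADDRSTRLEN];
            inet_ntop(AF_INET,
                      &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr,
                      ip, sizeof(ip));
            printf("%s: %s\n", ifa->ifa_name, ip);
        }
    }
    freeifaddrs(ifaddr);
    return 0;
}
```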
InfiniBand speed chart
In principle, it can be said that InfiniBand finds its application wherever large volumes of data need to be moved at high speed, whether in supercomputers, high-performance clusters, distributed databases and so on. For example, Oracle, an active IBTA member, has long used InfiniBand as practically the only interconnect inside its clustered products and has even developed its own data transfer protocol over InfiniBand, Reliable Datagram Sockets (RDS). InfiniBand infrastructure is quite an expensive pleasure, so it is hard to call it widespread. But you have a real chance of meeting it in person if your career takes you toward big gigabits per second and terabytes; then you will have a reason to dig deeper into the subject.