How do you build rocket-propelled storage systems on commodity hardware? RAIDIX SDS hardware platform architecture
RAIDIX is software-defined storage (SDS) that lets you build reliable, high-performance, fault-tolerant storage systems on standard server hardware.
In this article we describe the hardware requirements of RAIDIX, the deployment options for our SDS, and example hardware configurations of RAIDIX-based storage systems together with their possible applications.
Hardware requirements
The following server hardware is required to deploy RAIDIX SDS:
- one or two Intel Xeon processors of a suitable model and the required amount of RAM;
- one or more SAS HBA adapters for connecting internal and/or external disk enclosures; hardware RAID controllers are not compatible with RAIDIX;
- one or more interfaces for cache synchronization in a dual-controller configuration; the options are SAS, InfiniBand and Ethernet, and the interfaces can be duplicated; a single-controller configuration does not need these interfaces;
- interfaces for connecting to a SAN and/or NAS: Ethernet, InfiniBand, FC; direct connection to hosts (clients) via SAS is also possible;
- interfaces for management traffic and heartbeat: Ethernet ports, either dedicated or shared with other traffic types; bandwidth of 100Mb/s or more is enough; for heartbeat, dedicated interfaces with a direct connection between the controllers are recommended, and they are needed only in a dual-controller configuration;
- any standard SAS/SATA HDD models, with no restrictions on capacity, speed, form factor or manufacturer;
- a server platform suitable for installing the above equipment.
A large number of disks is connected using external disk shelves attached via SAS. We recommend internal and external disk enclosures that support hot-swapping of drives.
A compatibility list of recommended and tested equipment is available.
Deployment options
RAIDIX offers two deployment options: single-controller and dual-controller. In the first, RAIDIX software is installed on one physical server that acts as the storage controller. The disks are combined into a fault-tolerant RAID array, but the server itself and some of its components remain single points of failure. This may be acceptable for non-critical tasks.
The dual-controller configuration installs RAIDIX software on two identical physical servers, each of which becomes a storage controller. These can be separate server platforms or a single platform with two server nodes (cluster-in-a-box). Both controllers are physically connected to a single disk pool located in internal and external disk enclosures. RAIDIX combines the two servers into a fault-tolerant active-active cluster, and the controller caches are synchronized over dedicated interfaces.
In normal mode the load is distributed evenly between the two controllers: half of the volumes created on the disk array are served by one controller and the other half by the second. If one of the controller nodes fails for any reason, the entire load automatically switches to the surviving controller without interruption or data loss. This solution eliminates single points of failure and suits critical projects that are sensitive to downtime.
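The active-active behaviour described here can be pictured with a toy model. This is only an illustrative sketch: the `Cluster` class, the controller names and the volume names are hypothetical and say nothing about how RAIDIX implements volume ownership or failover internally.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a dual-controller active-active cluster (illustrative only)."""
    controllers: tuple = ("ctrl-a", "ctrl-b")   # hypothetical controller names
    owner: dict = field(default_factory=dict)   # volume name -> serving controller

    def assign(self, volumes):
        # Normal mode: alternate volumes so each controller serves about half.
        for i, vol in enumerate(volumes):
            self.owner[vol] = self.controllers[i % 2]

    def fail(self, failed):
        # Failover: the surviving controller takes over every volume.
        survivor = next(c for c in self.controllers if c != failed)
        for vol in self.owner:
            self.owner[vol] = survivor
        return survivor


cluster = Cluster()
cluster.assign(["vol1", "vol2", "vol3", "vol4"])
normal = dict(cluster.owner)            # snapshot of ownership in normal mode
survivor = cluster.fail("ctrl-a")
print("normal mode:", normal)
print("after failure:", survivor, "serves", sorted(cluster.owner))
```

In this model each controller owns two of the four volumes in normal mode, and after `ctrl-a` fails all four are owned by `ctrl-b`.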
Dual-Controller RAIDIX Platform
AIC HA401-LB2 is a good example of a server platform for dual-controller RAIDIX storage configurations. It is a 4U platform for highly available storage servers (cluster-in-a-box) with two identical server nodes, redundant power supplies and an internal enclosure for 24 hot-swappable 3.5" HDDs. Each server node (storage controller) supports two Intel Xeon processors, up to 2TB of RAM and 6 PCIe slots. Both server nodes have a pair of integrated 1GbE ports and a pair of integrated 10GbE ports. This is enough to deploy a high-performance storage system with anywhere from two dozen to several hundred drives. It is one of our recommended platforms and has been used successfully in many RAIDIX-based projects.
The pair of built-in 1GbE ports is convenient for storage management and for the heartbeat between controllers. The pair of integrated 10GbE ports can be used to connect to a storage network over iSCSI or NAS protocols.
Each AIC HA401-LB2 controller has 6 PCIe 3.0 expansion slots. To maximize storage throughput, these slots should be split evenly between three types of connections, each of which requires its own adapters:
- 2 adapters for connecting storage to the storage network;
- 2 adapters for synchronizing the controller cache;
- 2 adapters for connecting external disk shelves.
AIC HA401-LB2 accepts only low-profile PCIe adapters. RAIDIX-based solutions require Broadcom (LSI) SAS HBA adapters to connect external disk shelves, and in the low-profile format this manufacturer produces only PCIe 3.0 x8 adapters.
The theoretical maximum bandwidth of the PCIe 3.0 x8 bus is 7.9GB/s. In practice the real bandwidth of such an interface is no more than 6.5-7.5GB/s. Taking an average of 7GB/s, two adapters for external disk shelves can deliver 14GB/s, which is the maximum possible RAIDIX bandwidth on this hardware platform.
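The 7.9GB/s figure follows from the PCIe 3.0 signalling rate (8 GT/s per lane) and its 128b/130b line encoding. The short sketch below reproduces that arithmetic and the 14GB/s platform ceiling; the function name is ours, and the 7GB/s practical value is simply the average quoted in the text, not a measurement.

```python
def pcie3_bandwidth_gbytes(lanes: int) -> float:
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    transfer_rate_gt = 8.0        # PCIe 3.0 signalling rate: 8 GT/s per lane
    encoding = 128 / 130          # 128b/130b line encoding overhead
    return transfer_rate_gt * encoding * lanes / 8   # bits -> bytes


theoretical = pcie3_bandwidth_gbytes(8)
practical_per_adapter = 7.0       # averaged practical figure quoted in the text
adapters = 2                      # SAS HBAs for external shelves, per controller
platform_ceiling = practical_per_adapter * adapters

print(f"theoretical PCIe 3.0 x8: {theoretical:.2f} GB/s")
print(f"platform ceiling with {adapters} adapters: {platform_ceiling:.0f} GB/s")
```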
With this distribution of PCIe slots, the platform can connect from two to eight disk shelves directly:
- two SAS HBA adapters are available per controller,
- each SAS HBA adapter has 2 (SAS 9300-8e) or 4 (SAS 9305-16e) SAS interfaces (mini-SAS HD), depending on the model.
That gives a total of two to eight SAS interfaces per controller. Thus, counting the internal 24-disk enclosure of the AIC HA401-LB2, two 4U disk shelves with 60 3.5" SAS/SATA HDDs each give a storage system with 144 disks and a total rack height of 12U. Connecting eight such shelves yields a storage system with 504 3.5" HDDs and a total rack height of 36U. With 10TB drives, the usable capacity of such a system is up to 4.2PB.
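The drive counts and rack heights quoted here are simple sums, sketched below. The constants come from the text (24 internal bays in a 4U platform, 60-bay 4U shelves); the function name is ours, and the capacity printed is raw capacity, before any RAID parity is subtracted.

```python
INTERNAL_BAYS = 24   # AIC HA401-LB2 internal enclosure
PLATFORM_U = 4       # height of the platform itself
SHELF_BAYS = 60      # one external 4U60 shelf
SHELF_U = 4


def sizing(shelves):
    """Return (total drive bays, rack units) for a given number of shelves."""
    drives = INTERNAL_BAYS + shelves * SHELF_BAYS
    rack_units = PLATFORM_U + shelves * SHELF_U
    return drives, rack_units


for shelves in (2, 8):
    drives, units = sizing(shelves)
    raw_pb = drives * 10 / 1000          # 10TB drives, decimal units
    print(f"{shelves} shelves: {drives} drives, {units}U, {raw_pb:.2f} PB raw")
```

For two shelves this gives 144 drives in 12U, and for eight shelves 504 drives in 36U, matching the figures in the text.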
A further increase in the capacity and throughput of RAIDIX storage is possible with hardware platforms that have more PCIe x8 slots (and/or PCIe x16 support), so that the required number of network and disk adapters can be installed.
Tasks that do not need high performance or large capacity can get by with fewer adapters; there is no need to fill all 6 PCIe slots.
For example, if the internal disk enclosure of the AIC HA401-LB2 is enough for the project, one adapter per controller for cache synchronization is sufficient. If necessary, 1-2 network adapters can be added to each controller.
If you connect one 60-HDD disk shelf and aim for 10GB/s of throughput (across 2 controllers), four PCIe slots per controller are enough:
- 1 adapter for connecting the storage to the storage network (2 adapter 10GbE ports + 2 built-in 10GbE ports, totaling 40Gb/s, or 5GB/s, per controller);
- 2 adapters for synchronizing the controller cache;
- 1 adapter for connecting external disk shelves.
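The network figure in this list mixes gigabits and gigabytes, so it helps to spell out the conversion. The sketch below uses nominal line rates only (no protocol overhead), which is the same simplification the text makes; the function name is ours.

```python
def ports_to_gbytes(ports: int, gbit_per_port: float) -> float:
    """Aggregate nominal line rate of several ports, in GB/s (8 bits per byte)."""
    return ports * gbit_per_port / 8


# One dual-port 10GbE adapter plus the two built-in 10GbE ports.
per_controller = ports_to_gbytes(2 + 2, 10)
both_controllers = 2 * per_controller

print(f"per controller: {per_controller} GB/s")      # 40 Gb/s expressed in bytes
print(f"two controllers: {both_controllers} GB/s")
```

Four 10GbE ports give 40Gb/s, that is 5GB/s per controller and 10GB/s for the pair, which is the target stated for this configuration.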
Thus, RAIDIX SDS lets you:
- design the optimal storage system for each specific task;
- select only the necessary components;
- leave room for expansion.
As a result, the end customer gets maximum design flexibility and does not overpay for unnecessary components.
RAIDIX Storage Configuration Examples
Dual-controller RAIDIX configuration with 24 3.5" HDDs
Maximum throughput: 3-4GB/s. 24 SAS 7200rpm drives at 150-200MB/s per drive, in 2 RAID-6 or RAID-7.3 groups of 12 drives.
Useful capacity: 182TB for RAID-6 and 164TB for RAID-7.3.
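These figures can be reproduced with a short calculation. It is a sketch under our own assumptions, not a RAIDIX sizing tool: RAID-6 is taken to spend 2 drives per group on parity and RAID-7.3 to spend 3, only data drives are counted towards streaming throughput, and the capacity is expressed in binary terabytes (TiB), which is the unit in which 10TB decimal drives yield the quoted 182 and 164.

```python
DRIVE_BYTES = 10 * 10**12   # one 10TB drive, decimal terabytes
TIB = 2**40                 # one binary terabyte


def data_drives(groups, drives_per_group, parity):
    """Number of drives holding data across identical RAID groups."""
    return groups * (drives_per_group - parity)


def usable_tib(groups, drives_per_group, parity):
    """Usable capacity in TiB."""
    return data_drives(groups, drives_per_group, parity) * DRIVE_BYTES / TIB


def throughput_gbytes(groups, drives_per_group, parity, mb_per_drive):
    """Rough streaming throughput in GB/s, counting data drives only."""
    return data_drives(groups, drives_per_group, parity) * mb_per_drive / 1000


raid6 = usable_tib(2, 12, parity=2)     # two RAID-6 groups of 12 drives
raid73 = usable_tib(2, 12, parity=3)    # two RAID-7.3 groups of 12 drives
low = throughput_gbytes(2, 12, 2, 150)
high = throughput_gbytes(2, 12, 2, 200)

print(f"RAID-6: {raid6:.0f} TiB, RAID-7.3: {raid73:.0f} TiB")
print(f"RAID-6 streaming estimate: {low:.1f}-{high:.1f} GB/s")
```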
Using two RAID groups of 12 disks lets you bind each group to its own controller and get up to 1.8-2GB/s from each. The storage network can run on iSCSI 10GbE; in that case the 10GbE ports built into the server platform, 2 per controller, can connect the storage system to the initiators.
If desired, you can create one large RAID-7.3 or RAID-6 group of 24 disks, but then only 1 controller will be active. In that case the two built-in 10GbE ports may not be enough, and additional network interfaces will have to be installed in each controller: a dual-port 10GbE card or FC adapters.
The internal disk enclosure can be connected through the SAS 3008 HBA built into the platform (motherboard). For cache synchronization, one Broadcom SAS 9300-8e HBA adapter per controller is enough, with headroom to spare.
Specification for 24 3.5" 10TB HDDs with integrated 10GbE interfaces
Component | Model | Quantity, pcs |
---|---|---|
Server platform | AIC HA401-LB2 | 1 |
CPU | Intel Xeon E5-2620 v4, 8-core, 2.1GHz | 2 |
RAM | Crucial by Micron DDR4 16GB | 4 |
Bootable system media | Intel SSD DC S3500 Series (160GB, 2.5" SATA 6Gb/s) SSDSC2BB160G401 | 4 |
SAS HBA adapter | Broadcom SAS 9300-8e | 2 |
Cables for cache synchronization | mini-SAS HD (SFF-8644) to mini-SAS HD (SFF-8644) | 2 |
HDD | HGST Ultrastar HE10 (3.5", 10TB, 256MB, 7200 RPM, SAS 12Gb/s) | 24 |
RAIDIX License | For 26 disks, dual-controller | 1 |
Dual-controller RAIDIX configuration with 84 3.5" HDDs
Maximum throughput: 10GB/s. An external disk shelf for 60 3.5" HDDs is added to the server platform used in the previous specification. The internal enclosure delivers 3.5GB/s and the external shelf 6.5GB/s, so in total 10GB/s has to be served to clients and synchronized between the controllers.
Useful capacity: 691.22TB for RAID-6 and 654.84TB for RAID-7.3.
The internal disk enclosure is still connected through the SAS 3008 HBA built into the platform (motherboard). The external shelf and cache synchronization require three Broadcom SAS 9300-8e HBA adapters per controller: one for the disks and two for synchronization.
The storage can be connected to the storage network in several ways, depending on the infrastructure and project requirements.
Option 1, iSCSI 10GbE. For the storage system to deliver 10GB/s (80Gb/s), you need at least 4 10GbE ports per controller. Given the 2 10GbE ports built into the platform, that means installing one dual-port 10GbE adapter in each controller. In this case, however, if one controller fails, the second can deliver only half the bandwidth of the array, 40Gb/s. Ideally, therefore, you should install three dual-port 10GbE adapters per controller, which gives 80Gb/s from each controller.
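The port counts in this option follow from one question: must a single surviving controller carry the full load, or only its half? The sketch below works that out for dual-port adapters. The function and variable names are ours, and nominal line rates are used without protocol overhead.

```python
import math


def ports_per_controller(target_gbit, port_gbit, survive_failover):
    """Ports needed on each of two controllers for a target front-end rate."""
    # With failover tolerance one controller alone must carry the full target;
    # otherwise the two controllers split it evenly.
    per_controller_gbit = target_gbit if survive_failover else target_gbit / 2
    return math.ceil(per_controller_gbit / port_gbit)


def dual_port_adapters(ports_needed, builtin_ports=2):
    """Dual-port adapters to add on top of the built-in ports."""
    return math.ceil(max(ports_needed - builtin_ports, 0) / 2)


target = 10 * 8   # 10 GB/s expressed in Gb/s
minimum = ports_per_controller(target, 10, survive_failover=False)
ideal = ports_per_controller(target, 10, survive_failover=True)

print(f"minimum: {minimum} ports, {dual_port_adapters(minimum)} adapter(s) per controller")
print(f"failover-proof: {ideal} ports, {dual_port_adapters(ideal)} adapter(s) per controller")
```

The minimum case needs 4 ports (one added adapter) per controller, and the failover-proof case needs 8 ports (three added adapters), as stated in the text.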
Specification for 84 3.5" 10TB HDDs with 10GbE external interfaces
Component | Model | Quantity, pcs |
---|---|---|
Server platform | AIC HA401-LB2 | 1 |
CPU | Intel Xeon E5-2637 v4, 4-core, 3.5GHz | 4 |
RAM | Crucial by Micron DDR4 32GB | 8 |
Bootable system media | Intel SSD S3520 Series, 240GB, SSDSC2BB240G701 | 4 |
SAS HBA adapter | Broadcom SAS 9300-8e | 6 |
Cables for cache synchronization | mini-SAS HD (SFF-8644) to mini-SAS HD (SFF-8644) | 4 |
HDD for internal enclosure | HGST Ultrastar HE10 (3.5", 10TB, 256MB, 7200 RPM, SAS 12Gb/s) | 24 |
External disk shelf | HGST 4U60 G1 disk shelf, 60x10TB, 1ES0093 | 1 |
10GbE adapters | Intel Ethernet CNA X710 Series, dual-port 10GbE | 6 |
RAIDIX License | For an unlimited number of drives, dual-controller | 1 |
Option 2, FC 16Gbps. This is a compromise with 4 16Gb FC ports per controller. It gives 128Gb/s across two controllers and 64Gb/s if one controller fails (less than the ideal 80Gb/s, but still good).
Specification for 84 3.5" 10TB HDDs with 16Gb FC external interfaces
Component | Model | Quantity, pcs |
---|---|---|
Server platform | AIC HA401-LB2 | 1 |
CPU | Intel Xeon E5-2637 v4, 4-core, 3.5GHz | 4 |
RAM | Crucial by Micron DDR4 32GB | 8 |
Bootable system media | Intel SSD S3520 Series, 240GB, SSDSC2BB240G701 | 4 |
SAS HBA adapter | Broadcom SAS 9300-8e | 6 |
Cables for cache synchronization | mini-SAS HD (SFF-8644) to mini-SAS HD (SFF-8644) | 4 |
HDD for internal enclosure | HGST Ultrastar HE10 (3.5", 10TB, 256MB, 7200 RPM, SAS 12Gb/s) | 24 |
External disk shelf | HGST 4U60 G1 disk shelf, 60x10TB, 1ES0093 | 1 |
16Gb FC adapters | QLogic QLE2672-CK, dual-port 16Gb Fibre Channel HBA, PCIe 3.0 x8, 16/8/4, 2x SFP+ SR | 4 |
RAIDIX License | For an unlimited number of drives, dual-controller, with FC support | 1 |
Dual-controller RAIDIX configuration with 264 3.5" HDDs
Maximum throughput: 13-14GB/s. It is determined by the bandwidth of PCIe 3.0 x8 (6.5-7GB/s per slot).
Useful capacity: 2.183PB for RAID-6 and 2.092PB for RAID-7.3. We use the internal disk enclosure of the platform and 4 external 4U disk shelves with 60 3.5" HDDs each.
The internal disk enclosure is still connected through the SAS 3008 HBA built into the platform (motherboard). To connect the external shelves we use two Broadcom SAS 9305-16e HBA adapters per controller. Cache synchronization needs two Broadcom SAS 9300-8e HBA adapters per controller. In total, four of the six PCIe slots on each controller are used.
The storage can be connected to the storage network in several ways, depending on the infrastructure and project requirements.
Option 1, iSCSI 10GbE. Each controller can take 2 dual-port 10GbE adapters; together with the two built-in ports that gives six 10GbE interfaces per controller. The total bandwidth of the platform's network connections is then 15GB/s, which covers the total bandwidth of the configuration (13-14GB/s) with a margin. If one controller fails, however, system bandwidth drops to 7.5GB/s, because only half of the 10GbE ports remain.
Specification for 264 3.5" 10TB HDDs with 10GbE external interfaces
Component | Model | Quantity, pcs |
---|---|---|
Server platform | AIC HA401-LB2 | 1 |
CPU | Intel Xeon E5-2643 v4, 6-core, 3.4GHz | 4 |
RAM | Crucial by Micron DDR4 32GB | 16 |
Bootable system media | HGST Ultrastar (2.5", 600GB, 128MB, 10000 RPM, SAS 12Gb/s) HUC101860CS4204 | 4 |
SAS HBA adapter | Broadcom SAS 9300-8e | 4 |
SAS HBA adapter | Broadcom SAS 9305-16e | 4 |
Cables for cache synchronization | mini-SAS HD (SFF-8644) to mini-SAS HD (SFF-8644) | 4 |
HDD for internal enclosure | HGST Ultrastar HE10 (3.5", 10TB, 256MB, 7200 RPM, SAS 12Gb/s) | 24 |
External disk shelf | HGST 4U60 G1 disk shelf, 60x10TB, 1ES0093 | 4 |
10GbE adapters | Intel Ethernet CNA X710 Series, dual-port 10GbE | 4 |
RAIDIX License | For an unlimited number of drives, dual-controller | 1 |
Option 2, FC 16Gbps. We install 4 16Gb FC ports per controller. The total bandwidth of the platform's network connections is then 16GB/s, which covers the total bandwidth of the configuration (13-14GB/s) with a margin. If one controller fails, system throughput drops to 8GB/s.
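The two front-end options for this configuration can be put side by side with a few lines of arithmetic. This is a sketch using nominal line rates (10Gb/s and 16Gb/s per port), the same simplification as in the text; the dictionary layout and names are ours, and 14GB/s is taken as the upper end of the PCIe-bound array figure.

```python
ARRAY_CEILING_GBYTES = 14.0   # upper end of the 13-14 GB/s PCIe-bound figure

options = {
    "iSCSI 10GbE": {"ports_per_controller": 6, "gbit_per_port": 10},
    "FC 16Gb": {"ports_per_controller": 4, "gbit_per_port": 16},
}


def front_end_gbytes(opt, controllers):
    """Nominal front-end bandwidth in GB/s for a number of live controllers."""
    return controllers * opt["ports_per_controller"] * opt["gbit_per_port"] / 8


summary = {}
for name, opt in options.items():
    normal = front_end_gbytes(opt, controllers=2)
    degraded = front_end_gbytes(opt, controllers=1)
    # The system can never deliver more than the array itself can supply.
    summary[name] = (normal, degraded, min(normal, ARRAY_CEILING_GBYTES))
    print(f"{name}: {normal} GB/s normal, {degraded} GB/s on one controller")
```

Both options exceed the array ceiling in normal mode, so the network is not the bottleneck until a controller fails; in the degraded case the network becomes the limit at 7.5GB/s or 8GB/s.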
Specification for 264 3.5" 10TB HDDs with 16Gb FC external interfaces
Component | Model | Quantity, pcs |
---|---|---|
Server platform | AIC HA401-LB2 | 1 |
CPU | Intel Xeon E5-2643 v4, 6-core, 3.4GHz | 4 |
RAM | Crucial by Micron DDR4 32GB | 16 |
Bootable system media | HGST Ultrastar (2.5", 600GB, 128MB, 10000 RPM, SAS 12Gb/s) HUC101860CS4204 | 4 |
SAS HBA adapter | Broadcom SAS 9300-8e | 4 |
SAS HBA adapter | Broadcom SAS 9305-16e | 4 |
Cables for cache synchronization | mini-SAS HD (SFF-8644) to mini-SAS HD (SFF-8644) | 4 |
HDD for internal enclosure | HGST Ultrastar HE10 (3.5", 10TB, 256MB, 7200 RPM, SAS 12Gb/s) | 24 |
External disk shelf | HGST 4U60 G1 disk shelf, 60x10TB, 1ES0093 | 4 |
16Gb FC adapters | QLogic QLE2672-CK, dual-port 16Gb Fibre Channel HBA, PCIe 3.0 x8, 16/8/4, 2x SFP+ SR | 4 |
RAIDIX License | For an unlimited number of drives, dual-controller, with FC support | 1 |
Note
The configuration examples above use the built-in 1GbE ports for management traffic and heartbeat.
To obtain the maximum usable storage capacity, the RAIDIX configuration examples discussed here do not include spare disks.
RAIDIX supports the allocation of spare disks and switches to them when a disk fails. Whether to use spare disks, and how many, depends on the project conditions and is left to the customer's discretion; it is difficult to give general recommendations. If there are no free slots for spare disks, we recommend keeping the required number of disks in "cold" reserve for quick manual replacement.
Scope of application
Installing RAIDIX on the hardware platform options described above offers the following benefits:
- Maximum performance under sequential load, including multithreaded load. This matters for tasks where throughput (GB/s) is the key parameter rather than IOPS, which is important for random access. High performance is provided not only for a single stream but also for many parallel competing streams within the same storage system.
- High fault tolerance (availability). Support for a dual-controller storage configuration eliminates single points of failure. Support for RAID groups with error-correcting coding preserves data integrity and availability when up to two (RAID-6) or more (RAID-7.3 and RAID-N+M) drives fail.
- Large usable capacity and high storage density. RAIDIX works efficiently with large RAID groups of 12-24 disks with error-correcting coding, and large-capacity 6-12TB HDDs can be used successfully.
- No drop in performance during disk failure, fast reconstruction. Most storage systems suffer a sharp drop in performance under load when disks fail in parity-based RAID groups (RAID 5 or 6) and the array is degraded, which can lead to service outages. Reconstruction of disk groups in such solutions takes a long time, and the longer it takes, the greater the risk that further disks fail and data is lost. RAIDIX error-correcting coding is implemented so that storage performance is maintained even when the maximum number of drives tolerated by the configuration has failed. Reconstruction is also several times faster than in competing solutions.
- Upgradability and vertical scaling. RAIDIX-based storage systems support granular expansion within the system: installing additional disks and disk shelves, adding HBA adapters and network interfaces, and increasing processor power and RAM. Upgrading means replacing these components with newer, more advanced ones as they wear out or become obsolete. This keeps the system as flexible as possible with respect to the needs of the end customer.
This can be useful in areas such as:
- video surveillance (CCTV)
- media industry
- backup
- HPC (supercomputers)
If the task does not need high throughput but does need large capacity and high storage density, you can limit yourself to the integrated 10GbE ports and slightly relax the processor and RAM requirements. This may be relevant for file and content repositories and archives. In every case, start from the conditions of the specific project and select the optimal solution for it.