AccelStor: Its Own Take on All Flash
Flash drives are steadily taking over as the storage medium of choice in the Enterprise segment, driven both by a significant drop in price and by the growing capacity of individual drives. Where only mechanical hard drives were used until recently, SSDs are now actively deployed — and not only as internal drives in client systems, but also in the disk subsystems of servers and storage arrays. Within this segment, a distinct class of storage configurations uses SSDs exclusively as the storage medium: the so-called All Flash systems.
First of all, it is worth understanding what an All Flash storage system actually is. The name obviously implies that only flash drives are used in it. However, not all All Flash systems are alike; they can be loosely divided into three subtypes.
1. Traditional storage using SSDs
This is by far the most numerous kind of All Flash storage, because nothing is easier for a manufacturer than to populate an existing array with SSDs. The leading vendors, beyond re-sticking the nameplates ("All Flash storage"), do invest in additional firmware optimization for SSDs and in raising overall system performance. But there are also those who do not bother much and simply offer bundles consisting of a regular array plus a set of SSDs. As a result, the market offers everything from an All Flash Qnap NAS (leaving aside the question of how sensible such a solution is — formally speaking, it really is All Flash) to monstrous high-end NetApp FAS systems.
The main advantage of such a solution is, above all, its moderate cost. Each vendor, of course, charges its own brand premium, but on the whole the price of the All Flash system (meaning the "head" with the controllers) does not differ much from that of a classic array (and compared to the cost of the SSDs themselves, the difference is pennies).
The downside is the low overall performance of the solution. All Flash systems of this kind built on modern hardware deliver around 300K IOPS (4K, 100% random write; we focus on writes because they are much harder for storage than reads — read figures are, naturally, much higher). A strong negative deviation from this value usually points to a serious flaw in the firmware, while higher figures indicate better caching and/or firmware optimized for specific SSD models. Either way, saturation sets in at roughly 10-20 disks, so adding further disks only increases the available capacity, not the speed.
The main reason for this performance ceiling is the use of classic RAID algorithms. They were designed long ago for mechanical hard drives and take no account of how solid-state drives actually operate. Unlike an HDD, an SSD cannot simply overwrite a data block in place: it has to rewrite the entire page containing the modified block to a new location and free the old one for subsequent writes. On top of the standard RAID penalty, this adds an enormous overhead to rewrite operations.
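A rough back-of-envelope sketch of why this combination hurts. The RAID penalties are the classic textbook values; the page and block sizes are illustrative assumptions, not figures from any particular SSD or array:

```python
# Rough sketch of why random overwrites are expensive on SSD-backed RAID.
# All numbers are illustrative assumptions, not measurements.

def raid_write_penalty(level: str) -> int:
    """Classic back-end I/Os generated per host write."""
    return {"raid10": 2, "raid5": 4, "raid6": 6}[level]  # read-modify-write cost

def ssd_write_amplification(block_kib: int = 4, page_kib: int = 16) -> float:
    """An SSD cannot overwrite a 4 KiB block in place: in the worst case it
    rewrites the whole flash page containing that block to a new location."""
    return page_kib / block_kib

def effective_back_end_writes(host_writes: int, level: str) -> float:
    return host_writes * raid_write_penalty(level) * ssd_write_amplification()

# One host write to RAID 5 can balloon into ~16 back-end page operations:
print(effective_back_end_writes(1, "raid5"))  # -> 16.0
```

This multiplication of every random host write is exactly the overhead the next class of arrays sets out to eliminate.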
2. All Flash arrays with proprietary hardware
To overcome the bottlenecks of traditional storage systems, an entirely different hardware and software architecture is required. Examples of such solutions are the Pure Storage and IBM FlashSystem products. They have neither RAID in the usual sense (fault tolerance via parity is, of course, still there) nor SSDs as such (proprietary "drives" are used instead). The result is simply crazy performance and exceptionally low latency. But the price... truly, it costs as much as an airplane wing.
3. Software defined storage
Standing apart from this whole "zoo" of All Flash arrays is Software Defined Storage (SDS) — software that runs on ordinary x86 hardware and "emulates" a storage system. The quotation marks are deliberate, because these days the border between hardware and software controllers is quite blurred, unlike in the old days. Modern storage systems most often use standard x86 architecture running Linux-like operating systems; yes, additional offload controllers may be used for some operations, but the main difference from SDS is that both the hardware and the software are closed to the user. SDS, by contrast, lets you use almost any recommended hardware and make moderate modifications to the software components.
However, if SDS is used not just as a generic storage system but as an All Flash array, giving the user free choice of server platform and self-service software installation is a mistake. The main reason is the inability to guarantee the stated performance figures (which are, in fact, the main reason for choosing All Flash), along with the difficulty of supporting a long hardware compatibility list. Hence the so-called appliances on the market: complete solutions consisting of a server platform with pre-installed, pre-configured software and the required number of SSDs, which together deliver the stated performance.
Representatives of this type of solution (SDS appliances) are the heroes of our review: the All Flash arrays from AccelStor.
AccelStor: Its Own Take on All Flash
AccelStor was founded as a startup in 2014; its key investor (and effectively the owner of the project) is the well-known IT giant Toshiba. Even before its commercial launch, the company drew attention by winning top awards at various events dedicated to flash technologies — among them an award at the famous and prestigious Flash Memory Summit (2016).
The company's centerpiece is FlexiRemap, a special algorithm for working with SSDs that removes performance bottlenecks and maximizes drive lifespan. Its main idea is to convert random write requests into sequential chains: incoming data blocks are combined into chains whose size is a multiple of a flash "page", and only then are they written to the SSDs. From the drives' point of view, new data thus always arrives sequentially, which is what ultimately yields the high performance figures.
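A minimal sketch of that core idea as we understand it from the description above: buffer incoming random-address blocks and flush them as one sequential, page-aligned run, tracking each block's new location in a remapping table. The class name, block size, and page size are our own illustrative assumptions, not AccelStor internals:

```python
# Toy model: coalesce random writes into sequential, page-sized chains.

BLOCK = 4096    # host block size (assumption)
PAGE = 16384    # flash page size, a multiple of BLOCK (assumption)

class SequentialChainWriter:
    def __init__(self):
        self.buffer = []       # pending (logical_address, data) pairs
        self.remap = {}        # logical address -> physical offset
        self.next_offset = 0   # next free sequential physical offset

    def write(self, logical_addr, data):
        self.buffer.append((logical_addr, data))
        # Flush once a whole page's worth of blocks has accumulated.
        if len(self.buffer) * BLOCK >= PAGE:
            self.flush()

    def flush(self):
        # All buffered blocks land in one sequential run; only the
        # remapping table records where each logical block went.
        for logical_addr, data in self.buffer:
            self.remap[logical_addr] = self.next_offset
            self.next_offset += BLOCK
        self.buffer.clear()

w = SequentialChainWriter()
for addr in (700, 13, 2500, 42):          # random logical addresses
    w.write(addr * BLOCK, b"x" * BLOCK)
# Physical placement is strictly sequential regardless of logical order:
print(sorted(w.remap.values()))           # -> [0, 4096, 8192, 12288]
```

The drive only ever sees the sequential run; the cost of randomness is paid in RAM by the remapping table.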
While running, the FlexiRemap algorithm tracks how often every data block is accessed. When blocks are rewritten, the data is automatically ranked by usage frequency so that all the "hot" data ends up located as close together as possible. When those blocks change again, they move to new "pages" together, once more exploiting the SSD's faster sequential write mode instead of the traditional approach. The mechanism resembles a kind of virtual tiering and, among other things, also speeds up Garbage Collection, since the garbage collector likewise gets to do its job in sequential mode.
Although RAID is not used here, the data is still protected. All SSDs are divided into two symmetric groups, and all I/O is distributed evenly (striped) across both. In addition to data, each group stores checksums, so operation can continue if a drive fails. In total, the array can survive the failure of two SSDs — one per group — which in RAID terms is roughly equivalent to RAID 50 built from two groups.
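A toy model of the layout just described — two symmetric groups, each carrying its own checksum so that one failed drive per group can be reconstructed. We use simple XOR parity here purely for illustration; the source does not disclose AccelStor's actual checksum scheme:

```python
# Toy model: two symmetric groups, each with XOR parity for its data blocks.
from functools import reduce

def xor(blocks):
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*blocks))

def protect(group):
    """Append a parity block covering the group's data blocks."""
    return group + [xor(group)]

def rebuild(group_with_parity, lost_idx):
    """XOR of all surviving blocks (data + parity) restores the lost one."""
    survivors = [b for i, b in enumerate(group_with_parity) if i != lost_idx]
    return xor(survivors)

group_a = protect([b"\x01\x02", b"\x03\x04"])
group_b = protect([b"\x05\x06", b"\x07\x08"])
# One drive may fail in EACH group and still be rebuilt — two failures total:
assert rebuild(group_a, 0) == b"\x01\x02"
assert rebuild(group_b, 1) == b"\x07\x08"
```

This is why the overall fault tolerance resembles RAID 50: striping over two parity-protected groups.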
Organization of a data array
Writes use a round-robin mechanism, which distributes data as evenly as possible across all disks. On top of that, each SSD carries its own weight coefficient derived from its remaining write endurance: if a disk is more worn than the others, it will receive new data less often until the endurance figures even out. Compared to the traditional RAID approach, FlexiRemap thus significantly extends drive lifespan through uniform wear.
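The wear-aware placement described above can be sketched like this. It is a deliberate simplification under our own assumptions (endurance tracked as a simple per-drive counter); the real algorithm's weighting is not published:

```python
# Sketch: steer new writes toward the least-worn SSD until wear evens out.

class WearAwarePlacer:
    def __init__(self, endurance):
        self.remaining = list(endurance)   # write-endurance budget per SSD

    def place(self):
        # Pick the drive with the most endurance left. With even wear the
        # tie-breaking plus decrement degenerates into plain round robin.
        idx = max(range(len(self.remaining)), key=lambda i: self.remaining[i])
        self.remaining[idx] -= 1
        return idx

placer = WearAwarePlacer([100, 100, 95])   # SSD 2 is already more worn
first_ten = [placer.place() for _ in range(10)]
# SSD 2 receives no new data until its endurance catches up with the rest:
print(first_ten.count(2))   # -> 0
```

After those ten writes all three drives sit at the same endurance level, and placement reverts to even round robin.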
FlexiRemap vs RAID
The data-protection mechanism in the event of a drive failure deserves special mention. The group containing the failed SSD is automatically switched to read-only mode so that the rebuild onto the hot-spare disk completes as quickly as possible. Once the group is restored, it again participates in all types of operations, and the wear-leveling mechanism described earlier automatically kicks back in.
Speaking of SDS appliances, one must remember that such a device is essentially a server with pre-installed software — which, in storage terms, makes it single-controller by definition. And although some workloads can do without redundant storage controllers, storage vendors have long taught us that a "proper" array has two (or even more) of them. AccelStor's answer here is its Shared-Nothing technology for a two-node cluster.
AccelStor NeoSapphire models with two nodes come either in a single chassis (based on twin servers) or as two separate servers. In the latter case the nodes can be placed up to 100 m apart to build a disaster-recovery setup. Either way, an external InfiniBand 56G link synchronizes data between the nodes, with an additional heartbeat check over Ethernet.
Organization of synchronization between nodes
Unlike a conventional dual-controller array, it is not just the controllers (nodes) — with their attendant cooling modules and power supplies — that are duplicated here, but the data itself. Each node in an AccelStor NeoSapphire is fully independent and holds a complete copy of the data thanks to continuous synchronous replication. Both nodes operate in Symmetric Active-Active mode, without forwarding requests to each other (ALUA) as classical arrays do. Failover time on AccelStor's side therefore genuinely tends toward zero, and having two copies of the data markedly improves reliability compared with the traditional architecture.
Continuing the reliability theme, it is worth noting that AccelStor arrays do not cache write data, because they operate synchronously. All the intermediate work of the FlexiRemap algorithm on the data happens in controller RAM, but the array acknowledges an operation to the host only after the data has physically landed on the SSDs. Consequently, AccelStor All Flash arrays have no batteries or capacitors — there is simply no need for them.
Beyond the unique All Flash technologies, AccelStor NeoSapphire arrays also offer the functionality standard for the Enterprise market: Thin Provisioning, Redirect-on-Write snapshots with backup and restore via external CIFS/NFS folders, asynchronous replication, compression, and deduplication. Worth singling out is the Free Clone feature, which creates copies of volumes that occupy no physical space, since they are essentially references to the source volume. This can be very useful, for example, in VDI.
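A toy illustration of how Redirect-on-Write snapshots make Free Clone possible: a clone is just a reference to the source volume's block map, and new writes are redirected to fresh blocks, so the clone consumes no space until it diverges. This is our own simplified model of the general technique, not AccelStor's implementation:

```python
# Toy model of Redirect-on-Write: old blocks are never modified in place,
# so a clone that shares the source's block map costs zero extra space.

class Volume:
    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})     # logical block -> data

    def write(self, lba, data):
        # Redirect-on-write: build a new map pointing this LBA at new data;
        # any snapshot/clone holding the old map is unaffected.
        self.blocks = {**self.blocks, lba: data}

    def free_clone(self):
        # A clone shares the source's current block map — no data is copied.
        return Volume(self.blocks)

base = Volume({0: b"golden image"})
clone = base.free_clone()            # e.g. one VDI desktop off a master image
clone.write(1, b"user profile")      # the clone diverges only on write
assert base.blocks == {0: b"golden image"}   # source untouched
assert clone.blocks[0] == b"golden image"    # unchanged block still shared
```

In a VDI farm, dozens of such desktops can share one golden image, each paying only for the blocks it has changed.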
Naturally, all modern operating systems and virtualization platforms are supported. There is a plug-in for the VMware vSphere Web Client that manages volumes and fully exposes the Free Clone functionality.
An important advantage of AccelStor NeoSapphire as Software Defined Storage is that it runs on ordinary x86 hardware with completely standard SSDs. That said, the manufacturer does not give you free rein in choosing the hardware platform: it selects it for you, primarily to guarantee predictable performance and to rule out compatibility issues. Every AccelStor All Flash array is assembled for a specific customer in the required configuration and undergoes rigorous testing before shipping. The standard warranty on all arrays is 3 years NBD with advance replacement of parts, and since the vendor is present in Russia, technical support is also available in Russian.
When ordering an AccelStor NeoSapphire All Flash array, you can flexibly choose the required capacity; moreover, that figure is what is actually available to hosts, regardless of how the disk space is organized physically. Note that all models ship fully populated with disks: there are no free slots, so you cannot add disks later. This stems from the same performance and reliability requirements mentioned earlier. If you need more capacity in the future, expansion shelves are available for the higher-end models. You also need to decide up front how many nodes (controllers) the array will have, because an upgrade to a two-node configuration is not offered.
All models offer a choice of 10G iSCSI or 16G Fibre Channel interfaces, with 56G InfiniBand available as an option. On iSCSI models, a bonus on top of block access is support for the CIFS and NFS file protocols. The number of ports is sized to the system's stated performance so that they never become a bottleneck (usually 2-6 ports per node).
The drives are standard enterprise-class SSDs, most often with a SATA interface, since the drives do not need to be shared between two controllers (each node holds its own full copy of the data). There are also All Flash models based on NVMe disks.
Using standard server platforms and SSDs noticeably optimizes the cost of the solution as a whole. At the same time, AccelStor provides support for the entire solution under its own name, regardless of whose components make up the array.
And, yes, one extremely important point: there are no paid licenses! All the functionality is available "out of the box", and as the functionality grows, new features become available with firmware updates.
Putting it to the test
AccelStor offers a wide range of models with various rated performance levels. The entry-level NeoSapphire 3401 with 8 SSDs delivers 300K IOPS @ 4K, while the top-end P710 with 24 SSDs already produces 700K IOPS @ 4K. Among the NVMe models, the NeoSapphire P310 reaches the same 700K IOPS @ 4K with only 8 SSDs! Note that the quoted figures are for writes in steady state (read and any peak values are higher), i.e. the heaviest mode of operation for the array.
We tested a dual-node NeoSapphire H710 system with 48 SSDs (24 per node) and 27 TB of usable capacity. AccelStor quotes performance for this model of no less than 600K IOPS (4K, random write). Testing was carried out with IOmeter from three servers connected via Fibre Channel.
In synthetic tests the array turned out even better than its specification promises, which, in our view, is only a plus in a market segment where every quoted figure is questioned (thanks to marketers detached from reality for that!).
It is important to note that one of the key advantages of the FlexiRemap algorithm is high write performance with no degradation over time: the steady-state figure remains the same after 10 minutes, an hour, or more of continuous operation. To confirm this, we ran an IOmeter test (4K, 100% random write) for several hours from a single host. And indeed: performance stays practically unchanged over time.
When choosing an All Flash array, most users default to considering traditional arrays stuffed with SSDs as the candidates. If ~280K IOPS (4K, random write) is enough for you, you are thinking in the right direction. But business workloads increasingly demand that the hardware run at a full 146%, and an ordinary array, alas, cannot jump above its own head, while something like an IBM FlashSystem costs astronomical money. This is exactly where the AccelStor All Flash arrays come in handy. Solid performance, high reliability, flexible configuration, and competent technical support are far from a complete list of these arrays' merits. Add to that the total absence of hidden license fees and a longer SSD service life, and you get not just an interesting product, but a worthy tool for your business.
So AccelStor's place in the sun on the ultrafast-array market is already secure and bound to expand. And who knows what heights the company may yet reach.