IBM FlashSystem 820 Storage Overview and Testing

    Today we take on a challenging task: starting a series of articles about equipment designed for data centers. Most of the series will be devoted to storage systems from leading vendors, but we do not want to limit ourselves to SAN devices and plan to devote several articles to switching equipment and servers. Who is this series intended for? First of all, we want to introduce our customers to the equipment they can place at SAFEDATA data center sites and to help them choose exactly the hardware they need. Secondly, we would like to share the information obtained as a result of testing and operation, which, of course,

    We should note right away that SAFEDATA customers do not have to purchase and maintain expensive equipment on their own: these tasks can be delegated to the company's managers and engineers by renting the required devices. This series of articles will help you choose the specific switch, router or storage model best suited to your task.


    We decided to start the series of reviews and tests with the IBM FlashSystem 820 storage system, which demonstrates impressive performance and reliability.

    Description


    The constant development of information technology makes the performance of servers and storage systems increasingly important. Storage systems are no longer limited to simply holding user data; today they are required to provide highly efficient access to the information placed on them. Devices built on flash memory can provide high access speeds and low response times. The IBM FlashSystem 820 storage system uses eMLC (enterprise MLC) flash memory, which is more reliable than regular MLC (Multi Level Cell) thanks to added ECC and an increase in the number of write cycles to 30,000 (regular MLC supports up to 10,000 erase/rewrite cycles). According to the manufacturer, modules based on SLC (Single Level Cell) memory, rated for 100,000 erase/rewrite cycles, are also supported. There are two modifications of the FlashSystem 820, which differ in the amount of pre-installed flash memory: 10 or 20 TB. Installed flash memory modules can be combined into a RAID0 or RAID5 array. The main technical characteristics of the IBM FlashSystem 820 are presented in the table below.
    Characteristic                      Value
    Model                               9831-AE2
    Flash memory type                   eMLC
    Modification                        10 TB / 20 TB
    Available RAID0 capacity            12.4 TB / 24.7 TB
    Available RAID5 capacity            10.3 TB / 20.6 TB
    Minimum write latency               25 μs
    Minimum read latency                110 μs
    100% read, 4 KB blocks              525,000 IOPS
    100% write, 4 KB blocks             280,000 IOPS
    100% read, 256 KB blocks            3.3 GB/s (Fibre Channel), 5 GB/s (InfiniBand)
    100% write, 256 KB blocks           2.8 GB/s (Fibre Channel), 2.8 GB/s (InfiniBand)
    Power consumption                   300 W
    Cooling                             1,023 British thermal units (BTU) per hour
    Connection interfaces               4 x 8 Gbps FC ports, 4 x 40 Gbps QDR IB ports
    Maximum number of volumes (LUNs)    1024
    Supported RAID levels               0, 5
    Case dimensions (H x W x D)         1U x 432 mm x 638 mm
    Weight                              13.3 kg
    Management interfaces               HTTP, SSH, Telnet
    Reliability                         High-availability hardware configuration
                                        Two-dimensional flash RAID: Variable Stripe RAID at the flash module level, RAID 5 across modules at the system level
                                        Hot-swappable flash modules
                                        Redundant interfaces
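
    The 300 W and 1,023 BTU/hour figures in the table describe the same heat load in two different units. The short sketch below checks that they agree; the conversion factor is the standard one, and the function name is ours, chosen only for illustration.

```python
# 1 watt is approximately 3.412142 BTU per hour (standard conversion factor).
BTU_PER_HOUR_PER_WATT = 3.412142


def watts_to_btu_per_hour(watts: float) -> float:
    """Convert electrical power in watts to heat output in BTU per hour."""
    return watts * BTU_PER_HOUR_PER_WATT


if __name__ == "__main__":
    heat = watts_to_btu_per_hour(300)  # power figure from the table
    print(f"300 W is about {heat:.1f} BTU/hour")
```

    In other words, the cooling row simply restates that practically all consumed power has to be removed as heat.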

    Perhaps now is the time to clarify what we mean by the abbreviations KB, MB, GB and so on. By KB we mean 1024 bytes, by MB we mean 1024 KB, and so on. That is, a GB is 2^30 bytes, not 10^9. This time we decided not to use the formally more correct kibi, mebi, gibi and tebi prefixes (KiB, MiB, GiB and TiB), so as not to confuse readers who are not yet used to them. In the next article we will use the correct abbreviations from the start.
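
    To show how much the choice of units matters, here is a small sketch that compares binary units (powers of 1024, as used in this article) with their decimal namesakes (powers of 1000). The constant and function names are our own and serve only as an illustration.

```python
KIB = 2**10  # what this article writes as "KB"
MIB = 2**20  # "MB"
GIB = 2**30  # "GB"
TIB = 2**40  # "TB"


def binary_to_decimal_ratio(power: int) -> float:
    """Ratio between a binary unit (1024**power) and the decimal one (1000**power)."""
    return 1024**power / 1000**power


if __name__ == "__main__":
    for name, power in (("KB", 1), ("MB", 2), ("GB", 3), ("TB", 4)):
        extra = (binary_to_decimal_ratio(power) - 1) * 100
        print(f"a binary {name} is {extra:.1f}% larger than a decimal {name}")
```

    The gap grows with each prefix, so at the terabyte level it is already close to ten percent.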

    The difference between the modifications lies in the flash memory modules used: the 20 TB modification has modules of double capacity.



    We also decided to list the features and the benefits they provide.

    • IBM Variable Stripe RAID technology maintains both performance and reliability without reducing usable capacity.
    • An architecture with no single point of failure provides enterprise-class reliability and maximum efficiency even in the most demanding data centers.
    • Advanced flash technology combines capacity with performance by using eMLC flash memory.
    • 2D RAID technology, hot-swappable flash modules, and redundant components with a built-in auxiliary power supply improve data availability and IT infrastructure productivity.

    Variable Stripe RAID (VSR) technology protects data at the level of a memory page, a block or even a whole chip. This protection makes it unnecessary to replace an entire flash module when a particular chip fails. As a result, FlashSystem 820 series storage systems require servicing significantly less often.

    The key elements of the IBM FlashSystem 820 architecture are presented in the diagrams below.




    FlashSystem 820 is not the only IBM storage system built around flash modules. The vendor also offers the FlashSystem 710, 720 and 810 models. The balance of performance and availability for all four models is presented below. Speaking of improved availability, it is worth noting that the FlashSystem 720 and 820 models implement 2D Flash RAID data protection technology.


    While we were preparing our test bench for the first measurements, IBM released a new storage system, the FlashSystem 900. This model offers even greater performance: up to 1.1 million IOPS for reads, 800 thousand IOPS in mixed mode and 600 thousand IOPS for writes. The IBM FlashSystem 900 uses MLC flash developed jointly with Micron. The new flash memory eliminates one of the main disadvantages of MLC chips, the limited number of supported erase/rewrite cycles: according to the manufacturer, the endurance of the new chips is no worse than that of eMLC memory. IBM has decided in principle to stop using eMLC flash memory in new storage system models.

    Let us now return to the model under discussion, the FlashSystem 820, and look at how it is configured.

    Configuration


    The IBM FlashSystem 820 is managed using a Java application that the browser launches when the device is accessed over HTTP or HTTPS. At login you must enter a username and password and indicate whether to use encryption.


    After entering valid credentials, the administrator sees the main window of the Monitoring Utility program, which can configure and collect statistics from several FlashSystem storage systems at the same time.



    We will not examine every feature of this management system in detail and will dwell only on those we find most interesting.

    The Options menu allows you to add storage systems in manual or automatic mode, connect to multiple FlashSystems at once, perform multiple firmware updates, and display statistics for several storage systems.


    Utility settings are mainly related to the choice of a method for detecting storage systems in a local network.


    It is worth noting here that detection means automatically finding and adding FlashSystem management modules specifically. The Actions menu allows you to turn off or restart devices, save or restore the configuration, close the management connection or remove the storage system from the list.


    The two tabs located below (“Recent Event Log” and “Task Monitor”) display a list of events that occur with the device, as well as a set of completed administrator tasks.


    What actions can be performed with the device itself? The “Logical Units” menu group allows you to manage “virtual disks” (LUNs): create partitions of the required size, delete unnecessary ones, and change access settings.








    The "Storage" group lists all flash modules installed in the storage system. Modules with a capacity of 1 TB and 2 TB can be installed. All installed flash modules can be combined into a RAID array.




    Two types of arrays are supported: maximum capacity (an analogue of RAID0) and RAID5. To increase reliability, a RAID5 array is recommended. IBM calls this scheme 2D Flash RAID because RAID5 is also used inside each module, which keeps the whole module operational when one of its ten chips fails (nine store user data and one stores parity). Note that a RAID5 array does not use all modules to store user data: ten modules contain data, one stores parity information, and one more is kept idle as a hot spare for a failed module. Thus user data remains available even if two modules fail, provided, of course, that they do not fail simultaneously.
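
    The module layout described above can be checked against the capacities in the specification table. The sketch below assumes twelve equally sized modules in total (ten data, one parity, one spare, as in the text) and estimates the RAID5 capacity from the RAID0 figure; the function name and defaults are our own illustration, not an IBM formula.

```python
def raid5_usable_tb(raid0_tb: float, modules: int = 12,
                    parity: int = 1, spares: int = 1) -> float:
    """Estimate RAID5 capacity from the RAID0 capacity of the same system.

    Assumes all modules are the same size, so each one contributes
    raid0_tb / modules, and only the data modules hold user data.
    """
    data_modules = modules - parity - spares
    return raid0_tb * data_modules / modules


if __name__ == "__main__":
    for raid0 in (12.4, 24.7):  # RAID0 capacities from the specification table
        print(f"RAID0 {raid0} TB -> RAID5 about {raid5_usable_tb(raid0):.1f} TB")
```

    Under these assumptions the estimates come out at about 10.3 TB and 20.6 TB, which matches the RAID5 capacities listed in the table.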


    The Interfaces group displays information about installed interface modules and their ports.




    Information on the temperature of the air and of system components, power settings, fan operation and battery status can be found in the Environmental group. You can verify the health of the batteries installed inside the case using a periodic test, which can also be scheduled here.



    Settings for the control module parameters are collected in the Management group. Here, the administrator can specify the IP parameters of both network cards and the address of the DNS server.




    Time synchronization is controlled via NTP using the Date / Time item, while the Users item allows you to manage local or domain users.





    The Firmware item is intended for updating the firmware. Unfortunately, new firmware versions are available only to users with a support (service) contract, so we cannot show the entire update process. It is worth mentioning here that SAFEDATA customers who rent equipment receive devices with the latest stable firmware. In addition, customers can order an additional equipment support service covering all necessary maintenance of servers, storage systems and other network equipment for the entire period of use of the data center services, including keeping firmware up to date.



    The Services group allows the administrator to specify settings for connecting to the FlashSystem using the SNMP, Telnet, and SSH protocols. It is also worth noting that when certain events occur, the administrator can be sent an alert by e-mail, which is also configured here.







    The Statistics group is responsible for collecting and displaying statistics. Here the administrator can view current system utilization, build graphs of the load on the network interfaces, display the number of operations performed per second, and so on. A variety of counters are available for each system component.






    The FlashSystem 820 storage system is equipped with two management modules (MCP, management control processor). A module can be in active or passive mode. The active MCP manages LUNs and lets you view the status of system components. The passive module provides fault tolerance if the active MCP fails. The Cluster tab displays information about the MCP modules and allows you to switch the active MCP; in other words, the "cluster" consists of the two modules installed in a single FlashSystem 820.



    The system log information can be accessed using the Logs item.


    This concludes our review of the web interface of the IBM FlashSystem 820. Finally, we would like to mention that the device can be managed not only through a browser and graphical interface, but also from the command line over Telnet or SSH.


    Now let's test IBM FlashSystem 820.

    Security testing


    There are two fundamentally different reasons to connect to the IBM FlashSystem 820: data access and management. The FlashSystem 820 unit we tested was equipped with four Fibre Channel ports for data transfer, so device security can only be discussed in terms of the security of the management module. Of course, we understand that the management interface will be connected to an internal secure network, access to which is restricted by access lists on routers or by other means. Nevertheless, we wanted to know which services are reachable in principle. For testing we used the Positive Technologies XSpider 7.7 network security scanner (Demo build 3100). In total, four open ports were discovered: TCP-22 (SSH), TCP-23 (Telnet), TCP-80 (HTTP) and TCP-443 (HTTPS).
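
    For readers who want to repeat the most basic part of this check without a dedicated scanner, below is a minimal sketch of a TCP connect test for the four ports listed above. It is not how XSpider works and it performs no vulnerability analysis; it only reports which ports accept a connection. The function name is our own.

```python
import socket


def open_tcp_ports(host: str, ports, timeout: float = 1.0) -> list:
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

    A call such as open_tcp_ports("192.0.2.10", [22, 23, 80, 443]) would list the reachable management services; the address here is a placeholder from the documentation range, not the address of the tested device. Run such checks only against equipment you are authorized to test.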


    We now proceed directly to the load testing.

    File tests


    Before describing the load testing methodology and presenting the results, we would like to emphasize that we did not set out to obtain the maximum possible number of MB/s or IOPS. Our only goal was to find out what real speeds are available to users when one or two servers are connected to the storage system. As we will show later, the values we obtained are limited not by the FlashSystem 820 but by the equipment used for the tests. Thus, to squeeze the maximum out of the device, you would need to connect several powerful servers to it simultaneously. And although this storage system was created for ultra-fast servicing of requests from customer applications,

    The main parameters of the test bench are listed below.

    • Server No. 1: IBM x3850, 32 cores.
    • Server No. 2: IBM x3650 M4, 24 cores.
    • QLogic QLE2564 FC adapter.
    • Brocade 300 FC switch.
    • IBM FlashSystem 820 9831-AE2 storage.

    The first measurement tool was CrystalDiskMark version 3.0.3. We started with measurements over a single FC link.


    Then we connected a second FC link and enabled load balancing by adding the “Multipath I/O” component.


    We performed the next measurements using Intel NASPT version 1.7.1. For the Intel NASPT tests, we used the msconfig utility to reduce the RAM available to the operating system, in accordance with Intel recommendations, in order to limit the effect of local caching on the results. First, we measured data access speeds over one FC link between the server and the storage system for three file systems: NTFS, FAT32 and exFAT.


    As the diagram above shows, in some tests we came close to the limit of the 8GFC medium: 8 Gb/s nominal, and somewhat less in terms of usable data once 8b/10b encoding is taken into account. For the NTFS file system, we decided to compare performance with single and dual connections between the server and the storage system. The speeds obtained with the dual connection turned out to be lower than those we obtained with a single connection. We must admit that at this stage we were somewhat puzzled. Later we found the reason, high processor load with MPIO turned on, but first things first.
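
    To make the effect of 8b/10b encoding concrete, here is a small calculation. It relies on two facts about 8GFC that are not stated in the article: the line rate is 8.5 Gbaud, and 8b/10b encoding transmits every 8 data bits as 10 bits on the wire. The result is an upper bound for one direction of one link; frame headers reduce the real figure a little further. The names are our own.

```python
LINE_RATE_BAUD = 8.5e9        # 8GFC signalling rate, line bits per second
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b: 8 data bits are sent as 10 line bits


def payload_bytes_per_second(line_rate: float, efficiency: float) -> float:
    """Upper bound on data bytes per second for one direction of a link."""
    return line_rate * efficiency / 8


if __name__ == "__main__":
    rate = payload_bytes_per_second(LINE_RATE_BAUD, ENCODING_EFFICIENCY)
    print(f"{rate / 1e6:.0f} MB/s in decimal units")
    print(f"{rate / 2**20:.1f} MB/s in the binary units used in this article")
```

    This gives roughly 850 decimal MB/s, or about 810 MB/s in the binary units used here, as the ceiling for a single FC link.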


    The next testing tool was Intel IOMeter, which can generate both a purely synthetic load in the form of pure reads or writes and patterns that emulate the behavior of various servers and workstations. The synthetic load is represented by three tests: 100% read, 100% write, and 50% read with 50% write. The diagram below shows the results of each of the three synthetic tests over one FC link, depending on the size of the data block. The test involved two servers connected to the FlashSystem 820 through an FC switch.


    As the diagrams show, as the block size grows, data access speeds increase until they reach the obvious limit: the performance of a single Fibre Channel interface, that is, 8 Gb/s.

    In addition to measuring storage performance in terms of bandwidth, we also measured the number of operations performed by the device per unit time (IOPS). The diagrams below show the results of the same measurements, expressed in thousands of IOPS.


    For any write operations, a peak in performance is visible with a block size of 4 kilobytes, which is due to the size of the internal data structure.

    Naturally, we decided to repeat the same measurements with the storage system connected over two FC links, so that the Fibre Channel interface would not be the bottleneck in our test bench.


    As expected, we found a significant increase in performance. However, when reading blocks of 4 and 8 kilobytes, we came close to the performance of two FC interfaces, which suggests that the performance of the IBM FlashSystem 820 is higher still; that is, with all four FC ports in use, even our two test servers could obtain higher access speeds to user data. The diagrams below show the device’s performance in thousands of IOPS when connected using two FC links.


    As with a single-link connection, there is a pronounced peak in performance on 4 kilobyte data blocks.

    The time has come to move away from synthetic measurements and make our test bench simulate the load produced by various servers and workstations. We chose four storage usage scenarios: Database, Fileserver, Workstation, and Webserver. Storage performance with a single FC connection is presented below.


    We repeated the same measurements, but using two FC connections.


    The results, as for synthetic tests, clearly indicate that when connecting to the FlashSystem 820 using one FC-link, the FC connection itself will be the bottleneck.

    In all of the following tests in this section, we used Server No. 1 and read data blocks of 4 kilobytes in size. First, we compared system performance depending on the depth of the request queue when using a single thread.


    The last chart, which shows the time taken to read a data block from the storage system, probably deserves some explanation. While the system is not overloaded, a read takes approximately 200 μs and practically does not depend on the number of physical connections, which is a very good result.
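
    One way to read the 200 μs figure is through Little's law: the number of requests in flight equals throughput multiplied by latency. The sketch below uses it to estimate the IOPS ceiling for a given number of outstanding requests, assuming the latency stays flat at 200 μs. This is an estimate for orientation only, not one of our measured results, and the function name is our own.

```python
def iops_ceiling(outstanding_requests: int, latency_us: float) -> float:
    """Little's law: throughput = requests in flight / time per request."""
    return outstanding_requests / (latency_us * 1e-6)


if __name__ == "__main__":
    for depth in (1, 2, 4, 8):
        print(f"queue depth {depth}: about {iops_ceiling(depth, 200):.0f} IOPS")
```

    With a single thread and a queue depth of 1 the ceiling is about 5,000 IOPS; deeper queues raise it proportionally only while the storage and the link are not saturated.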

    Then we examined how system performance depends on the number of threads at a queue depth of 4.


    The only thing we would like to mention here is the increased processor load when using MPIO. That is, when balancing the load between several physical FC interfaces, the Windows operating system spends precious processor cycles. For example, in almost all tests without MPIO the processor load did not exceed 10% (typically 5-7%), while using MPIO led to almost 100% CPU utilization. Looking ahead, we note that no such effect was observed when testing the FlashSystem 820 with an Oracle database running on Oracle Linux Server release 6.6.

    With this, we complete the file tests of the IBM storage system and move on to measuring storage performance when working with the Oracle database.

    Database tests


    Test Purpose
    To determine the main performance characteristics of the IBM FlashSystem 820 storage system when it is used by an Oracle database.

    Testing Methods
    To test the performance of the storage used by the Oracle database, we used the SLOB2 utility (The Silly Little Oracle Benchmark) by Kevin Closson, a former performance architect for the Oracle Exadata Machine. More detailed information about the utility is available in the author’s blog.

    Testing consists of read/write operations performed by multiple threads. An AWR (Automatic Workload Repository) diagnostic report is collected for each test and serves as the source of information about the actual performance of the disk subsystem during that test.

    Platform
    Operating system: Oracle Linux Server release 6.6 with the kernel 3.8.13-55.1.5.el6uek.x86_64.
    Processor: Intel Xeon CPU E7-4830 @ 2.13 GHz, 16 cores, 32 logical processors
    File system block size: 4 KB

    Test parameters
    24 measurements were performed. The following parameters were varied:
    - The number of threads (from 1 to 128 with a twofold increase).
    - The ratio of the number of read operations to write operations (Read Only, Write Only and “70% Read, 30% Write”).
    - The size of the block used by the database.
    - Using multipath I / O.
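
    The text does not spell out how the 24 measurements map onto these parameters. One reading that is consistent with that number, and with the four groups of results that follow, is 8 thread counts times 3 read/write modes for each combination of block size and MPIO setting. The sketch below enumerates the matrix under that assumption; all names in it are ours.

```python
from itertools import product

THREADS = [2**i for i in range(8)]  # 1, 2, 4, ..., 128
MODES = ["read only", "write only", "70% read / 30% write"]
BLOCK_SIZES_KB = [4, 8]             # block sizes that appear in the results
MPIO = [False, True]


def runs_per_configuration() -> list:
    """Every (threads, mode) pair measured for one block size and MPIO setting."""
    return list(product(THREADS, MODES))


def configurations() -> list:
    """Every (block size, MPIO) combination."""
    return list(product(BLOCK_SIZES_KB, MPIO))


if __name__ == "__main__":
    print(len(runs_per_configuration()), "runs per configuration")
    print(len(configurations()), "configurations")
```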

    Test results
    4 KB, without MPIO.





    4 KB, with MPIO.




    8 KB, without MPIO.




    8 KB, with MPIO.





    IOPS and bandwidth
    The maximum read performance, 62 thousand IOPS, was achieved with a block size of 4 KB, 32 threads and Multipath I/O disabled. The throughput was 242 MB/s. Measurements with 8 KB blocks and Multipath I/O disabled show on average 5.6% lower read IOPS, while throughput increases by an average of 88%, to 457 MB/s on 32 threads.
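
    The IOPS and throughput figures above are two views of the same measurement, linked by the block size. The sketch below shows the conversion in the binary units used in this article (1 MB = 1024 KB); the function names are our own.

```python
def throughput_mb_s(iops: float, block_kb: float) -> float:
    """Throughput in binary MB/s for a given IOPS figure and block size in KB."""
    return iops * block_kb / 1024


def iops_from_throughput(mb_s: float, block_kb: float) -> float:
    """IOPS implied by a throughput in binary MB/s and a block size in KB."""
    return mb_s * 1024 / block_kb


if __name__ == "__main__":
    print(f"62,000 IOPS x 4 KB = {throughput_mb_s(62_000, 4):.0f} MB/s")
    print(f"457 MB/s at 8 KB = {iops_from_throughput(457, 8):.0f} IOPS")
```

    The first line reproduces the 242 MB/s quoted for 4 KB blocks. The second shows that 457 MB/s at 8 KB corresponds to about 58.5 thousand IOPS, a few percent below the 4 KB figure, which agrees with the text.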

    Multipath I/O
    Enabling Multipath I/O negatively affects the results. With a block size of 8 KB, we lose about 20% when reading with 16 threads and 25% with 32 threads. In the mixed 30/70 mode, peak IOPS also decreases, by 20% and 25% for block sizes of 4 KB and 8 KB respectively. However, at lower thread counts Multipath I/O improves IOPS and throughput: with a block size of 4 KB, measurements with 4 and 8 threads showed 12% better results in mixed read/write mode.

    30/70 mode
    The case of 30% writes and 70% reads corresponds to the real operating mode of a database serving a multi-user application. Saturation occurs at 16 threads. The read latency increases from 240 μs with 1 thread to 400 μs with 16. The longest storage response time, 950 μs, was obtained with 64 threads. With 128 threads the latency is lower because the threads of the test system competed with each other for resources, so the load actually generated on the storage in read/write mode decreased.

    Latency
    The maximum database read latency for modern HDDs is 4 ms, which is several times greater than even the worst value obtained for the FlashSystem 820 (950 μs) and an order of magnitude greater than the values obtained before saturation. Note that latency is critical for relational databases, since obtaining the necessary records from tables usually involves a series of consecutive accesses to data in files. That is, to get the block containing the desired table row, you first need to locate it by reading several index blocks. The high latency that is acceptable for file storage can have a strong impact on database performance.
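
    The effect of dependent reads is easy to quantify. The sketch below assumes, purely for illustration, that fetching one row requires four reads in sequence (three index blocks and one table block) and that none of them is served from cache; the 4 ms and 400 μs latencies are the HDD and saturated FlashSystem 820 figures quoted in this article. The function name is our own.

```python
def lookup_time_ms(dependent_reads: int, read_latency_us: float) -> float:
    """Time for a chain of reads where each must finish before the next starts."""
    return dependent_reads * read_latency_us / 1000


if __name__ == "__main__":
    reads = 4  # assumption: 3 index blocks + 1 table block, no cache hits
    print(f"HDD at 4 ms per read:     {lookup_time_ms(reads, 4000):.1f} ms per row")
    print(f"flash at 400 us per read: {lookup_time_ms(reads, 400):.1f} ms per row")
```

    Because the reads cannot overlap, the per-read latency is multiplied by the length of the chain, so the gap between the two media carries over directly to every row lookup.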

    Correlation with actual operation
    In our experience of operating an application that services the network infrastructure of a telecommunications operator, with hundreds of active users during business hours, the application generates an average load equivalent to 8 synthetic test threads with a 30% share of write operations. In this mode, the IBM FlashSystem 820 showed a read latency of 270 μs for a 4 KB data block, which is an excellent result.

    Conclusion


    In conclusion, we would like to note that the IBM FlashSystem 820 fully met our expectations for data access speed and latency. Flash memory itself, as well as a variety of devices built on it, is increasingly entering the enterprise equipment market. Flash memory provides maximum performance, which, coupled with minimal latency, will satisfy even customers who place the highest demands on the performance and reliability of storage systems. The IBM FlashSystem line of storage systems is a modern solution capable of outstanding performance, and today we got closely acquainted with one of its members, the FlashSystem 820.

    IBM FlashSystem 820 storage systems are designed to accelerate a variety of enterprise applications, including databases for online transaction processing (OLTP) and online analytical processing (OLAP), virtual desktop infrastructure, technical computing applications and scalable cloud infrastructures. The system provides the highest level of performance per gigabyte of data, which allows organizations to analyze data quickly using both traditional tools and new technologies developed for Big Data analysis. However, for storing large amounts of archival information that does not require high-speed access, the IBM FlashSystem 820 is clearly excessive. Whether or not you are able to determine the necessary equipment on your own, SAFEDATA specialists will help you with the selection, purchase, placement, configuration and support of any network equipment and storage systems.

    We would also like to thank IBM for the equipment provided for testing and assistance in the setup and measurement process.
