IBM FlashSystem 900 Storage Overview

    An overview and test of the IBM FlashSystem 900 all-flash array. Photos, basic principles and a few synthetic tests inside.

    image

    Modern information systems and the pace of their development dictate their own rules for IT infrastructure. All-flash storage systems have long since evolved from a luxury into a means of meeting the SLAs you need. So here it is: a system capable of delivering over a million IOPS.

    image

    Specifications



    Basic principles


    This storage system is an all-flash array whose speed comes from MicroLatency modules and optimizations of MLC flash technology.

    image


    When I asked our presales engineer which technologies provide fault tolerance and how many gigabytes are actually hidden inside (IBM claims 11.4 TB of usable space), he answered evasively.

    As it turned out, it is not that simple. Inside each module there are flash memory chips and 4 FPGA controllers that build a RAID with a variable stripe (Variable Stripe RAID, VSR) across them.

    Module internals, two double-sided boards
    image

    image

    image

    image



    Each chip in a module is divided into so-called layers. Across the N-th layer of all the chips, a variable-length RAID 5 is built inside the module.



    If a layer in one of the chips fails, the stripe length is simply reduced and the failed memory cells are no longer used. Thanks to the surplus of memory cells, the usable capacity is preserved. As it turned out, the system actually contains well over 20 TB of raw flash, i.e. the redundancy is almost at RAID 10 level, and because of it the failure of a single chip does not require rebuilding the entire array.
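
    To make the VSR idea concrete, here is a minimal toy sketch (my own illustration, not IBM's implementation): a RAID 5 stripe runs across one layer of every chip in the module, and when a layer dies that chip is dropped from the stripe, which simply becomes shorter instead of triggering a rebuild. The chip count and the single-parity assumption are mine.

    # Toy model of Variable Stripe RAID -- illustration only, not IBM's actual layout.
    # A stripe spans the N-th layer of every healthy chip; single parity is assumed.

    def stripe_data_layers(total_chips: int, failed_layers: int, parity: int = 1) -> int:
        """Data layers remaining in a stripe after retiring failed chip layers."""
        healthy = total_chips - failed_layers
        if healthy <= parity:
            raise ValueError("stripe too short to hold data plus parity")
        return healthy - parity

    # Example: a stripe across 16 chips with single parity.
    print(stripe_data_layers(16, failed_layers=0))  # 15 data layers
    print(stripe_data_layers(16, failed_layers=2))  # 13 -- the stripe shrank, no array rebuild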



    With RAID already in place at the module level, FlashSystem combines the modules into a conventional RAID 5 (if this post collects 20 likes before January 1, I will arrange a test with a module forcibly pulled out under maximum load).
    Thus, to reach the desired level of fault tolerance, a system with 12 modules of 1.2 TB each (the capacity printed on the module) yields a little more than 10 TB.
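
    A quick back-of-the-envelope check of that number, as a sketch under my own assumptions (one module's worth of capacity going to RAID 5 parity and one to a spare; neither figure is stated in the post):

    modules = 12
    module_tb = 1.2        # capacity printed on each module

    parity_modules = 1     # system-level RAID 5 across the modules
    spare_modules = 1      # assumption: one module's worth reserved as a spare

    marked_raw_tb = modules * module_tb
    data_tb = (modules - parity_modules - spare_modules) * module_tb

    print(marked_raw_tb)   # 14.4 TB as printed on the modules
    print(data_tb)         # 12.0 TB; formatting and the in-module redundancy described
                           # above bring this down toward the ~10-11.4 TB actually seen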

    Web interface
    Yes, it turned out to be an old friend (hello, v7k clusters), complete with the annoying habit of picking up the locale from the browser. FlashSystem has a management interface similar to Storwize, but the two differ considerably in functionality: the FlashSystem software is used only for configuration and monitoring, and the software layer (the virtualizer) is simply not there, since the systems are designed for different tasks.
    image


    Testing

    Having received the system from a partner, we rack it and connect it to the existing infrastructure. Honestly, when you hold this 2U piece of iron in your hands and realize that it packs 1,100,000 IOPS, plus a 2U-high stack of greenbacks, you instinctively call a colleague to help carry it.

    image

    We connect the storage according to a pre-agreed scheme, configure zoning and check availability from the virtualization environment. Next we prepare a lab stand. The stand consists of 4 blade servers connected to the storage system under test through two independent 16 Gb FC fabrics.

    Wiring diagram


    Since my organization leases out virtual machines, the test evaluates the performance of a single virtual machine as well as a whole cluster of virtual machines running on vSphere 5.5.

    We tune our hosts a little: we configure multipathing (Round Robin and a limit on the number of outstanding requests) and also increase the queue depth in the FC HBA driver.

    ESXi Settings
    Our settings may differ from yours!



    On each blade server we create one virtual machine (16 GHz of CPU, 8 GB RAM, 50 GB system disk). To each machine we attach 4 disks, each on its own LUN on the FlashSystem and on its own Paravirtual (PVSCSI) controller.

    VM settings



    For testing we run synthetic workloads with a small 4K block (read/write) and a large 256K block (read/write). The storage system steadily delivered 750k IOPS, which looks very good to me even against the cosmic 1.1M IOPS figure announced by the manufacturer. Do not forget that everything is pushed through the hypervisor and the guest OS drivers.
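
    For the record, the same numbers in arithmetic form (the port count per blade is my assumption; everything else comes from the description of the stand):

    achieved_iops = 750_000
    claimed_iops = 1_100_000
    print(f"{achieved_iops / claimed_iops:.0%} of the vendor figure")   # ~68%, quoted again in the conclusions

    # Rough SAN-imposed ceiling for the 256K runs, assuming each of the 4 blades has
    # one 16 Gb port into each of the two fabrics (an assumption, not stated above):
    hosts, ports_per_host, gbit_per_port = 4, 2, 16
    fabric_gbyte_s = hosts * ports_per_host * gbit_per_port / 8    # ~16 GB/s of raw link bandwidth
    block_gb = 256 * 1024 / 1e9                                    # 256 KiB expressed in GB
    print(int(fabric_gbyte_s / block_gb), "IOPS at 256K, ignoring protocol overhead")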

    Graphs of IOPS, latency and, as it seems to me, the absence of TRIM
    1 VM, 4k block, 100% read, 100% random. With all the resources driven from a single virtual machine, the performance graph behaved non-linearly and jumped between 300k and 400k IOPS; on average we got about 400k IOPS:



    4 VMs, 4k block, 100% read, 100% random:



    4 VMs, 4k block, 0% read, 100% random:



    4 VMs, 4k block, 4% read, 100% random, after 12 hours. There were no dips in performance.



    1 VM, 256k block, 0% read, 0% random:



    4 VMs, 256k block, 100% read, 0% random:



    4 VMs, 256k block, 0% read, 0% random:



    Maximum system throughput (4 VMs, 256k block, 100% read, 0% random):





    I will also note that, as with all well-known vendors, the declared performance is achieved only under ideal laboratory conditions (a huge number of SAN uplinks, a specific LUN layout, dedicated servers with RISC architecture and specially tuned load-generator programs).

    Conclusions


    Pros: huge performance, ease of setup, a user-friendly interface.
    Cons: beyond the capacity of a single system, scaling is done only by adding more shelves. The “advanced” functionality (snapshots, replication, compression) is moved out to the storage virtualization layer. IBM has built a clear hierarchy of storage systems headed by a storage virtualizer (SAN Volume Controller or Storwize V7000), which provides tiering, virtualization and centralized management of your storage network.

    Bottom line: the IBM FlashSystem 900 copes with its job of handling hundreds of thousands of I/Os. On our current test infrastructure we managed to get 68% of the performance declared by the manufacturer, which still gives an impressive performance density per TB.
