What is Intel Optane? Part 1. Optane Memory

    With this post I would like to open a small series of articles devoted to Intel Optane products based on 3D XPoint technology. A cursory review of Russian-language sources showed that there is no good material on the subject; in addition, the comments on our announcements convinced me that there is a deep misunderstanding of why all this is needed and why it is implemented the way it is.


    3D XPoint Technology


    Let's start with a brief summary of the 3D XPoint technology itself (pronounced “three dee cross point”). I apologize in advance - we do not currently disclose detailed information about the technology. In addition, the focus of this series is on the final products rather than on the technology itself.

    First, although the technology is a joint development of Intel and Micron, each vendor turns it into products independently. Thus, everything I say about products based on 3D XPoint applies only to Intel products.

    Second, 3D XPoint is not NAND, not NOR, and not DRAM, but a completely different beast. Without revealing the details of the physical implementation, I will describe its key characteristics and how 3D XPoint differs from NAND and DRAM.

    • Unlike NAND, write operations are not tied to pages and erase operations are not tied to blocks. With 3D XPoint, data can be accessed at the physical level down to the individual cell. In addition, data does not have to be erased before a write - it can simply be overwritten, which eliminates read-modify-write operations and greatly simplifies garbage collection. This reduces latency and increases the number of I/O operations per second (IOPS); moreover, writes are almost as fast as reads. Finally, the endurance of 3D XPoint memory is much higher than that of NAND (effects such as electron leakage from cells do not exist here). To summarize, 3D XPoint is faster and more durable than NAND. However,

    • Unlike DRAM, 3D XPoint allows devices with a higher storage density, is non-volatile and, at the same time, is cheaper. On the downside, 3D XPoint as a memory technology is somewhat slower than DRAM (note that we are comparing technologies, not products based on them).
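
    To make the first point more tangible, here is a toy model of the worst case for a small update in place. The page and block sizes are illustrative assumptions, not figures for any real NAND or 3D XPoint part, and real SSD controllers hide much of this cost through remapping and garbage collection.

```python
# Toy model: physical work needed to change a few bytes in place.
# All sizes are illustrative assumptions, not vendor specifications.

NAND_PAGE_BYTES = 16 * 1024   # assumed NAND page size (smallest write unit)
NAND_PAGES_PER_BLOCK = 256    # assumed pages per block (smallest erase unit)


def nand_in_place_update(update_bytes: int) -> dict:
    """Worst case for NAND: the whole block is read, erased and rewritten."""
    block_bytes = NAND_PAGE_BYTES * NAND_PAGES_PER_BLOCK
    return {
        "bytes_read": block_bytes,
        "erases": 1,
        "bytes_written": block_bytes,
        "write_amplification": block_bytes / update_bytes,
    }


def overwrite_in_place_update(update_bytes: int) -> dict:
    """Memory that can be overwritten at fine granularity: no erase, no copy."""
    return {
        "bytes_read": 0,
        "erases": 0,
        "bytes_written": update_bytes,
        "write_amplification": 1.0,
    }
```

    Under these assumed sizes, a 4 KB update costs a full block rewrite in the NAND worst case, while memory that supports overwriting writes only the 4 KB that changed.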

    All of the above concerns 3D XPoint as such - this, however, matters less to users than the characteristics of specific devices based on 3D XPoint. So let's move on to Intel Optane products based on this technology, starting with what Intel Optane actually is. In short, it is the brand for all Intel products based on 3D XPoint technology. In more detail: Intel takes 3D XPoint “wafers”, performs its own testing and selection of memory chips, independently designs the end device (the SSD controller, PCB layout and firmware), tests and validates the end device, and brings it to market - all of this is hidden behind the words "Intel Optane".

    Intel Optane


    At the moment, two fundamentally different products have been officially announced and launched: Intel Optane Memory for client use and Intel® Optane SSD DC P4800X for server use. In this article we will look at the client product in more detail; the server product will be the subject of the next review.

    So, Intel Optane Memory. The first thing to understand about this product is that, despite its name, it is not DRAM but an NVMe SSD in the M.2 2280-S3-BM form factor.
    Top view - under the sticker there is one 3D XPoint chip (this is the 16GB version; the 32GB version has two 3D XPoint chips - the pads for the second chip are visible):

    image

    The module is single-sided, so the back side is empty:

    image

    The device complies with the NVM Express 1.1 specification. Two capacities are currently on the market: 16GB (one 16GB 3D XPoint memory chip) and 32GB (two 16GB 3D XPoint memory chips). Some interesting design details:

    • the controller is an internal development of Intel
    • the design does not use DRAM
    • only 2 PCIe gen3 lanes are used, not 4, as many would expect
    • rated endurance - 100GB of data written per day for 5 years
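
    The endurance figure is easier to compare with other drives when converted into total data written and drive writes per day. The short calculation below does only that arithmetic; it assumes decimal units and ignores leap days.

```python
# Endurance arithmetic from the rated figure: 100 GB written per day for 5 years.
# Assumptions: decimal GB/TB, 365-day years.

RATED_GB_PER_DAY = 100
RATED_YEARS = 5
DAYS_PER_YEAR = 365


def total_terabytes_written() -> float:
    """Total data written over the rated period, in TB."""
    return RATED_GB_PER_DAY * DAYS_PER_YEAR * RATED_YEARS / 1000


def drive_writes_per_day(capacity_gb: int) -> float:
    """How many times the full capacity can be rewritten every day."""
    return RATED_GB_PER_DAY / capacity_gb


if __name__ == "__main__":
    print(total_terabytes_written())   # 182.5
    print(drive_writes_per_day(16))    # 6.25
    print(drive_writes_per_day(32))    # 3.125
```

    In other words, the rating corresponds to about 182.5 TB written in total, or roughly 6 full drive writes per day for the 16GB module and 3 for the 32GB module.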

    Performance test


    Now about performance

    image
    (the 32GB version is faster because it uses two 3D XPoint memory chips versus one in the 16GB version)

    At first glance, the bandwidth and IOPS figures are not impressive - but that is not where the real story lies. These numbers were measured at a queue depth of 4, whereas other SSDs are usually measured at a queue depth of 32 or higher. It is at shallow queue depths that Optane stands out the most. For clarity, here is a graph of the performance of different types of devices at different queue depths *:

    image

    Moreover, as our internal tests show, the vast majority of tasks that an ordinary user faces at home or in the office run at a queue depth of 1 to 4 (more on this below), whereas SSD specifications are written using workloads with a queue depth of 32 (for SATA) or more (for NVMe). The difference is very clear.
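
    Why queue depth matters so much can be seen from Little's law: the average number of requests in flight equals throughput multiplied by average latency, so IOPS is roughly queue depth divided by latency, up to the device's throughput ceiling. The sketch below applies that formula; the latencies and ceilings are invented for illustration and are not measurements of Optane Memory or any other drive.

```python
# Little's law applied to storage: IOPS ~= queue_depth / latency,
# capped by the device's maximum throughput.
# The device parameters used in the example are hypothetical.


def estimated_iops(queue_depth: int, latency_us: float, max_iops: int) -> float:
    """Estimate IOPS assuming latency stays constant until the device saturates."""
    return min(queue_depth * 1_000_000 / latency_us, max_iops)


if __name__ == "__main__":
    # Two made-up devices: one with low latency, one with a higher ceiling.
    devices = {
        "low-latency device": {"latency_us": 20, "max_iops": 250_000},
        "high-latency device": {"latency_us": 100, "max_iops": 300_000},
    }
    for depth in (1, 4, 32):
        for name, params in devices.items():
            print(depth, name, estimated_iops(depth, **params))
```

    With these made-up numbers, the low-latency device is several times faster at queue depths 1 and 4, while at a queue depth of 32 both devices hit their ceilings and the picture reverses. This is the same effect the graph above shows with real data.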

    However, Intel does not position Optane Memory as a regular SSD, for an obvious reason - the capacity of the devices is not enough for typical user tasks (with the exception of some interesting options, such as a small but fast and reliable boot drive for Linux, a scratch disk for Adobe Photoshop, a small but fast cache with Intel Cache Acceleration Software, or an interesting solution described here). All the power of the Intel marketing machine is aimed at promoting a new technology for accelerating (roughly speaking, caching, although that is not an exact definition) a slow SATA drive (be it a hard drive, a solid-state drive or even some hybrid models) with a fast Optane Memory module.

    This usage model imposes restrictions on supported hardware and OS:

    • 7th generation Intel Core processor or later
    • Intel 200 Series Chipset or later (full list here)
    • A BIOS that integrates the RST UEFI driver version 15.5 or later (15.7 for the X299 chipset series). Yes, legacy BIOS mode is not supported - Optane Memory requires UEFI boot.
    • Windows 10 64-bit
    • Intel Rapid Storage Technology Driver 15.5 or later
    • A bootable SATA drive (this is the drive that Optane Memory will accelerate). Only GPT partitioning is supported.
    • 5MB of free space at the end of the SATA drive - this is needed for RST metadata
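
    The list above can be read as a simple checklist. The sketch below encodes it as such; the `SystemInfo` structure and its field names are hypothetical and invented for this example, and the code does not query real hardware or use any Intel tool.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SystemInfo:
    """Hypothetical description of a PC; field names are invented for this sketch."""
    cpu_core_generation: int
    chipset_on_supported_list: bool      # 200 Series or later, per Intel's list
    chipset_is_x299: bool
    bios_rst_uefi_version: Tuple[int, int]
    uefi_boot: bool
    os_is_windows10_64bit: bool
    rst_driver_version: Tuple[int, int]
    boots_from_sata_drive: bool
    boot_drive_is_gpt: bool
    free_mb_at_end_of_drive: float


def unmet_requirements(s: SystemInfo) -> List[str]:
    """Return the requirements from the list above that the system fails."""
    problems = []
    if s.cpu_core_generation < 7:
        problems.append("7th generation Intel Core processor or later")
    if not s.chipset_on_supported_list:
        problems.append("Intel 200 Series chipset or later")
    min_bios = (15, 7) if s.chipset_is_x299 else (15, 5)
    if s.bios_rst_uefi_version < min_bios:
        problems.append("BIOS with RST UEFI driver %d.%d or later" % min_bios)
    if not s.uefi_boot:
        problems.append("UEFI boot (legacy BIOS mode is not supported)")
    if not s.os_is_windows10_64bit:
        problems.append("Windows 10 64-bit")
    if s.rst_driver_version < (15, 5):
        problems.append("Intel RST driver 15.5 or later")
    if not s.boots_from_sata_drive:
        problems.append("bootable SATA drive")
    if not s.boot_drive_is_gpt:
        problems.append("GPT partitioning on the SATA drive")
    if s.free_mb_at_end_of_drive < 5:
        problems.append("5MB of free space at the end of the SATA drive")
    return problems
```

    An empty result means every item on the list is satisfied; otherwise the function returns the items that still need attention.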

    It is configured like this:

    1. We make sure that the motherboard BIOS supports Optane Memory (see above; all “Optane Memory Ready” boards on 200 series chipsets now ship with a BIOS that supports it, but boards from earlier batches can still be found on the market, and those will need a BIOS update).

      And yes, Intel has done a tremendous amount of work with board manufacturers - all boards that support Optane Memory carry this badge on the box:

      image
    2. We take a system with a SATA drive on which Windows 10 64-bit is installed (the drive must be connected to a SATA port served by the Intel AHCI controller in the chipset, otherwise RST will not see it); the partition table must be GPT.

    3. We install the Optane Memory module (it must go into an M.2 slot whose PCIe lanes come from the chipset and support "remapping" of PCIe lanes to the Intel AHCI controller built into the chipset).

    4. We download the utility from here (you can choose either the standard RST utility, which manages both Optane Memory configurations and regular RST arrays, or a simplified utility that only turns Optane Memory configurations on and off and shows statistics).

    5. We install the utility. It automatically switches the SATA mode in the BIOS to RST / Optane mode (this requires one reboot) and enables acceleration with Optane Memory (this requires a second reboot). As a result, instead of two disk devices the system sees only one - the so-called Optane Volume.

    6. PROFIT! Namely:

      • Faster loading of the operating system;
      • Speeding up most I/O operations (essentially caching, but with reasonably smart algorithms).

    Principle of operation


    We’ll also talk a little about how this all works.

    First, when Optane Memory is activated, the RST driver moves the files needed to boot the OS, as well as the file table, to the fast Optane Memory drive. The key point is that the data is moved, not copied. The RST driver is designed so that data held in the cache on the fast device is not necessarily duplicated on the slow device. This increases overall system performance and, in addition, solves the problem of data synchronization. However, as you can see, a physical failure of Optane Memory is likely to lead to loss of access to the data on the SATA drive. Because the data is moved immediately upon activation, the very first boot after activation is already faster than before (this is especially noticeable when the accelerated drive is a hard drive rather than a SATA SSD - however,

    Second, while the system is running, the RST driver performs caching continuously. Here there is one important difference between Optane Memory modules of different capacities: the 16GB device supports only block-level caching, while the 32GB device supports both block-level and file-level caching (both work simultaneously). With block caching, the decision to cache a block is made instantly at the time of the I/O request. With file caching, the driver monitors how often files are accessed and records this in a special table, which is then used (when the system is idle or on a user-defined schedule) to determine which files stay in the cache, which are removed and which are added.
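
    To illustrate the general idea of file-level caching, here is a minimal sketch of a frequency table that is turned into a keep / remove / add plan during idle time. It is a simplified illustration under my own assumptions (a greedy "hottest files first" policy); it is not the actual logic of the RST driver, which is not described here.

```python
from collections import Counter
from typing import Dict, Set


class FileCachePlanner:
    """Hypothetical frequency-based planner; not the RST driver's real algorithm."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.access_counts: Counter = Counter()   # the "special table"
        self.cached: Set[str] = set()

    def record_access(self, path: str) -> None:
        """Called on every file access while the system is in use."""
        self.access_counts[path] += 1

    def rebalance(self, file_sizes: Dict[str, int]) -> Dict[str, Set[str]]:
        """Called at idle time: choose the most-accessed files that fit."""
        wanted: Set[str] = set()
        used = 0
        for path, _count in self.access_counts.most_common():
            size = file_sizes.get(path)
            if size is None or used + size > self.capacity_bytes:
                continue  # unknown file or does not fit, try the next one
            wanted.add(path)
            used += size
        plan = {
            "keep": self.cached & wanted,
            "evict": self.cached - wanted,
            "add": wanted - self.cached,
        }
        self.cached = wanted
        return plan
```

    The important property is the split between cheap bookkeeping on every access and the more expensive decision about cache contents, which is deferred to a moment when the system is not busy.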

    Both types of caching use what are, in my opinion, quite clever decision-making algorithms. I cannot describe them in depth here, but for a general understanding I will note that, for example, video files are not cached (yes, the driver looks at the file extension), the file size is taken into account, and the type of load is determined - random access is preferred over sequential access for caching, which makes sense given how extremely slow hard drives are at random access. On the Internet I have seen some negative comments along the lines of “the cache will instantly clog up with data”, “16GB is not enough for anything” and the like - as a rule, these come from people who have never tested Optane Memory. I have not heard negative feedback about the performance of this solution from any of our partners,

    A few very important points.

    • If you need to connect the SATA drive to another system while acceleration with the RST driver and Optane Memory is enabled, you must either move the entire configuration (SATA device + Optane Memory, after making sure that the new system supports Optane Memory) or turn off acceleration first (this takes a single button press in the utility - when acceleration is turned off, the data from the cache is moved to the SATA device, the RST metadata is deleted and the Optane Memory device is cleared).

    • Disk cloning will not work while acceleration with Optane Memory is enabled, because no cloning utility can work with RST metadata. Directly cloning the partition with the metadata is not enough either - the metadata is tied to the serial numbers of the Optane Memory and SATA devices. Backups at the file system level work without any difficulty.

    Why is it necessary


    Now it's time to talk in more detail about why all this is needed. Let's start with a closer look at the loads that ordinary PC users put on their systems. Even before the development of Optane Memory was completed, my colleagues conducted a study, as part of the Intel Product Improvement Program, of what ordinary users do with a computer at home and at work. The results - the number of actions of various types performed by users (averaged over 1 day of PC use):

    image

    All these events are closely tied to the performance of the system disk and, as a rule, require random access to data, which hard drives handle extremely poorly. Thus, Optane Memory can significantly speed up each of the above actions.

    However, you may ask: why should I buy Optane Memory to speed up a hard drive if, for the same money, I can buy a 128GB SATA SSD, put the OS and key applications on it, and use the hard drive for everything else? On the one hand, this is a question of convenience. If you have the basic skills to choose where the OS and applications are installed (I suspect that all GT readers fall into this category, but I can assure you that my parents, like most PC users, do not), and you are not too lazy to do this for every application (games are especially problematic - with current disk space requirements, 128GB will fill up with the OS and 1-2 games), then from this point of view a hybrid SSD + HDD configuration may be quite convenient.

    However, keep in mind that with Optane Memory no manual data placement is required - as soon as you stop using one application and start using another more actively, the necessary data is quickly added to the cache. On the other hand, recall the graph shown above - performance as a function of queue depth. At shallow queue depths, data access latency on Optane Memory is much lower than on a SATA SSD. Inside Intel, we measured the queue depths generated by various applications - here are the results:

    Queue depth when using applications:

    image

    image

    Queue depth when starting applications:

    image

    image

    Queue depth distribution for a typical corporate user’s working day (measured on Intel employees occupying different positions in the company):

    image

    image

    Thus, queue depth distribution for different user workloads:

    image

    And we have already seen how much better Optane Memory handles work in shallow queues.

    Comparison of system performance with HDD versus the same system with HDD + Optane Memory:

    image

    Another interesting comparison is the same test, but with the system without Optane Memory having twice as much RAM:

    image

    And, in fact, this is a very valid comparison. Although some types of workloads require a large amount of RAM, the lion's share of them do not. Thus, for many users it may make sense to install 4 GB of memory instead of 8 GB and invest the money saved in accelerating the storage system.

    Conclusion


    To summarize, let me repeat that Optane Memory can be used as a standalone SSD, but this is not its main usage model. All the magic happens when it is used as an accelerator for a slow hard drive (or even a SATA SSD) - a relatively small investment can speed up the system several times on most user workloads. This is achieved thanks to both the hardware (Optane Memory has significantly lower access latency than other SSDs on the market, and its performance at shallow queue depths is much higher than that of alternative solutions) and the software - the RST driver uses quite advanced logic for caching operations (and this is what distinguishes it from the previous technology, Intel Smart Response Technology).

    I am very interested to hear opinions about the product and the solution as a whole in the comments - however, I would like to avoid negative opinions that stem from a lack of understanding of the solution or a lack of experience with it. If in doubt, it is better to ask before criticizing.

    P.S. In the next article we will look at a server product based on 3D XPoint technology - the Intel Optane SSD DC P4800X Series - together with the Intel Memory Drive Technology software solution.

    * All tests mentioned in this article were conducted internally by Intel. All tests with Optane Memory were run on 7th generation Intel Core processors; the queue depth measurements used a 6th generation Intel Core processor. System configuration used for the tests:

    image
