Two in one: Intel Optane Memory H10 (part 2)

Original author: Billy Tallis
  • Translation
Part 1 >> Part 2

AnandTech Storage Bench - The Destroyer


The Destroyer is an extremely long test that replays application access patterns with a large volume of I/O. As in real-world use, the drives occasionally get short breaks that allow background garbage collection and cache flushing to run, but these idle times are capped at 25 ms so that a single run does not take an entire week. The AnandTech Storage Bench (ATSB) tests do not involve running the actual applications that generated the workloads, so the scores are not very sensitive to the CPU and RAM performance of our new testbed, though the move to a newer version of Windows and fresh drivers can have a noticeable effect.



We evaluate the results of this test by reporting the drive's average throughput, average I/O latency, and the total energy consumed over the course of the test.



The Intel Optane Memory H10 actually performs better in The Destroyer with caching disabled and the Optane portion completely idle. This test does not leave much time for background optimization of data placement, and the total amount of data moved far exceeds the 32 GB Optane cache; the 512 GB of QLC NAND is not fast enough to flush the cache in a timely manner.





The QLC portion of the Optane Memory H10 by itself has poor average and 99th-percentile latency scores, and the absence of a cache only makes matters worse. Even the 7200 RPM hard drive performed better.





The average read latencies of the Optane Memory H10 are worse than all of the TLC-based drives, but much better than the hard drive's, with or without an Optane cache in front of it. For writes, the QLC portion of the H10 drops to last place once its SLC cache runs out.





The Optane cache has a positive effect on the H10's 99th-percentile read latency, bringing the drive close to the Crucial MX500 and putting it well ahead of the larger 1TB QLC-based 660p. The 99th-percentile write latency is terrible, but even with the overflowing cache causing extra writes, the H10 is not as bad as the DRAMless Toshiba RC100.

AnandTech Storage Bench - Heavy


Our Heavy test applies a proportionally larger write workload than The Destroyer, but takes far less time. The total amount of data written during the Heavy test is not enough to fill the drive, so performance never degrades to steady state. This test is much more representative of everyday usage patterns, and its scores are strongly influenced by the drive's peak performance. Detailed information about the Heavy test can be found in the corresponding article on AnandTech. The test is run twice: once on a freshly erased drive, and once after the drive has been filled with sequential writes.



In the Heavy test, caching unambiguously helps the Intel Optane Memory H10, bringing its average data rate into the range of good TLC-based NVMe SSDs when the test is run on an empty drive. Full-drive performance is still better with the cache than without it, but ultimately the Optane cannot hide how the QLC NAND behaves once the SLC cache fills up. None of the TLC-based drives slows down as much when full as the QLC drives do.





The H10's average and 99th-percentile latencies are roughly on par with other TLC drives only when the test is run on an empty drive. When the Heavy test runs on a full drive, with a full SLC cache, latency is even worse than that of the Optane-cached hard drive. The full H10's average latency is still far better than that of the QLC portion alone, but the Optane cache does not improve the 99th-percentile latency at all.





The H10's average read latency is significantly worse when the Heavy test is run on a full drive, but it is still slightly better than the SATA SSD's. Average write latency is where the QLC looks especially bad: the full H10 does worse than the hard drive, and with Optane caching disabled, write latency is ten times higher than the TLC SSDs'.





The 99th-percentile read latency of the H10 without Optane caching is a serious problem in the full-drive test run, but the Optane cache brings read QoS back into a decent range for SSDs. The 99th-percentile write latency looks bad without the Optane cache, and even worse with it.

AnandTech Storage Bench - Light


Our Light test has relatively more sequential accesses and lower queue depths than The Destroyer or the Heavy test, and it is by far the shortest of the three overall. It is based largely on applications that are not especially storage-intensive, so the results mostly reflect application launch times and file load times. The test can be thought of as the sum of all the small storage delays of everyday use, but with idle times capped at 25 ms it takes less than half an hour to complete. Detailed information about the Light test can be found in the corresponding article on AnandTech. As with the ATSB Heavy test, this test is run twice: on a freshly erased drive, and again after filling the drive with sequential writes.



The Intel Optane Memory H10 is competitive with entry-level NVMe drives when the Light test is run on an empty drive, though the higher score of the QLC portion on its own suggests the full H10 is probably underperforming somewhat. Full-drive performance is worse than all of the TLC-based SSDs, but still far ahead of the hard drive without an Optane cache.





The average and 99th-percentile latencies of the Optane Memory H10 are competitive with the TLC NAND drives when the test is run on an empty drive, and even on a full drive the latency figures remain better than a mechanical hard drive's.





The average write latency on a full drive is the only result that drags the H10 down to the level of entry-level NVMe drives, though the TLC-based, DRAMless Toshiba RC100 fared even worse in this scenario.





In contrast to the average latencies, the 99th-percentile read and write latencies of the Optane H10 show that it struggles significantly when the drive is full. The Optane cache is not enough to compensate for the diminished SLC cache.

Random read performance


The first random read performance test uses very short bursts of operations issued one at a time with no queuing. The drives get enough idle time between bursts to keep the overall duty cycle at 20%, so thermal throttling is not a factor. Each burst consists of 32 MB of 4 KB random reads spread across a 16 GB span of the drive, and a total of 1 GB of data is read.
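As a rough illustration, the burst pattern can be sketched in Python (scaled down: the real test runs against the raw block device with precise idle-time control, and `random_read_burst` is just a hypothetical helper name):

```python
import os
import random

def random_read_burst(path, burst_bytes, io_size=4096):
    """Issue one burst of random reads at queue depth 1.

    Reads `burst_bytes` worth of data as `io_size` chunks at random
    aligned offsets within the file, one I/O at a time (QD1).
    """
    blocks = os.path.getsize(path) // io_size  # aligned positions available
    fd = os.open(path, os.O_RDONLY)
    total = 0
    try:
        for _ in range(burst_bytes // io_size):
            offset = random.randrange(blocks) * io_size
            total += len(os.pread(fd, io_size, offset))
    finally:
        os.close(fd)
    return total
```

In the actual test each burst is 32 MB of 4 KB reads over a 16 GB span, with enough idle time inserted between bursts to hold the duty cycle at 20%.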



In the burst random read test, the data fits easily into the Optane cache of the Optane Memory H10, so it outperforms all of the flash-based SSDs while remaining much slower than the pure Optane storage devices.

Sustained random read performance


The sustained random read performance test is similar to the one from our 2015 test suite: queue depths from 1 to 32 are tested, and the average performance and energy efficiency across QD1, QD2 and QD4 are reported as the primary metrics. Each queue depth is tested for one minute or 32 GB of data transferred, whichever finishes first. After each queue depth is tested, the drive is given up to one minute to cool off, so heat buildup is unlikely to affect the higher queue depths. Individual read operations are still 4 KB, spread across a 64 GB span of the drive.
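The queue-depth sweep can be approximated as follows (a sketch only: a thread pool stands in for the native asynchronous I/O a real benchmark would use, and the function name is made up for illustration):

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

def sweep_queue_depths(path, depths=(1, 2, 4), ios_per_depth=256, io_size=4096):
    """Measure random-read throughput (bytes/s) at each queue depth.

    A thread pool of size `qd` approximates keeping `qd` I/Os in flight;
    the primary score is the average across QD1, QD2 and QD4.
    """
    blocks = os.path.getsize(path) // io_size
    fd = os.open(path, os.O_RDONLY)
    results = {}
    try:
        for qd in depths:
            offsets = [random.randrange(blocks) * io_size
                       for _ in range(ios_per_depth)]
            start = time.perf_counter()
            with ThreadPoolExecutor(max_workers=qd) as pool:
                for data in pool.map(lambda off: os.pread(fd, io_size, off),
                                     offsets):
                    assert len(data) == io_size
            elapsed = time.perf_counter() - start
            results[qd] = ios_per_depth * io_size / elapsed
            time.sleep(0.01)  # stand-in for the cool-off idle period
    finally:
        os.close(fd)
    return results
```

The reported primary score would then be the mean of `results[1]`, `results[2]` and `results[4]`.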



With the sustained random read test spanning a wider swath of the drive than the Optane cache can cover, the H10's performance is merely on par with the TLC-based SSDs.

Graphs

Intel Optane Memory H10 512GB


Intel SSD 660p 1TB


Intel SSD 760p 512GB


Intel Optane SSD 900P 280GB


Samsung 970 EVO 500GB


Intel Optane Memory H10 512GB (QLC)


Intel Optane Memory H10 512GB (32GB Optane)


Intel Optane Memory M10 64GB


Team MP34 512GB


Crucial MX500 500GB


Intel Optane Memory 32GB


MyDigitalSSD SBX 512GB


Western Digital WD Black 7200RPM 1TB


Intel Optane SSD 800P 118GB


WD Black 1TB 7200RPM + Optane Memory 32GB

The Optane cache provides only a small advantage over the bare QLC storage at low queue depths, but at higher queue depths the cache-enabled H10 starts to pull meaningfully ahead of its QLC portion. Unfortunately, performance still tops out fairly low, and the flash-based SSDs end up surpassing the H10's random read speed.

Random write performance


The first burst random write performance test is structured similarly to the read test, but each burst is only 4 MB and the total test length is 128 MB. The 4 KB random writes are spread across a 16 GB span of the drive and performed one at a time, with no queuing.
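A scaled-down sketch of one write burst, assuming a hypothetical helper name (a real run would target the raw device and use incompressible data throughout):

```python
import os
import random

def random_write_burst(path, burst_bytes, io_size=4096, span=16 * 2**30):
    """Issue one burst of random writes at queue depth 1.

    Writes are scattered over `span` bytes (capped at the file size),
    one at a time with no queuing.
    """
    blocks = min(os.path.getsize(path), span) // io_size
    fd = os.open(path, os.O_RDWR)
    buf = os.urandom(io_size)  # incompressible payload
    total = 0
    try:
        for _ in range(burst_bytes // io_size):
            offset = random.randrange(blocks) * io_size
            total += os.pwrite(fd, buf, offset)
        os.fsync(fd)  # make sure the writes actually reach the drive
    finally:
        os.close(fd)
    return total
```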



With caching enabled, the H10's burst random write performance is higher than either half of the drive can manage on its own, but far less than the sum of its two parts. A decent SLC cache on a TLC drive still beats an Optane cache sitting in front of QLC.

As with the sustained random read test, our sustained 4 KB random write test runs for up to one minute or 32 GB per queue depth, spans a 64 GB portion of the drive, and gives the drive up to one minute of idle time between queue depths to allow for cache flushing and cooling.



With the sustained random write test spanning a much wider range than the Optane cache can cover, the Optane Memory H10 falls behind all of its flash-based competitors. The caching software ultimately adds overhead, yielding performance well below what the QLC portion achieves on its own using just its SLC cache.

Graphs


Random write performance on the Optane Memory H10 is erratic, but tends to decline as queue depth grows. Two layers of caching getting in each other's way is not a recipe for consistent performance.

Sequential read performance


The first sequential read performance test uses short 128 MB bursts issued as 128 KB operations at queue depth 1. The test averages performance across eight bursts, for a total of 1 GB of data transferred from a drive containing 16 GB of data. Between bursts, the drive is given enough idle time to keep the overall duty cycle at 20%.
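The shape of one such burst can be sketched like this (illustrative names only; the real test times each burst and enforces the 20% duty cycle):

```python
import os

def sequential_read_burst(path, burst_bytes, io_size=128 * 1024, start=0):
    """Read `burst_bytes` of contiguous data as `io_size` operations
    at queue depth 1, beginning at byte offset `start`."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    offset = start
    try:
        while total < burst_bytes:
            data = os.pread(fd, io_size, offset)
            if not data:          # hit end of file
                break
            total += len(data)
            offset += len(data)
    finally:
        os.close(fd)
    return total
```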



The sequential read performance of the Optane Memory H10 is much lower than that of the high-performance TLC-based drives, but comparable to entry-level NVMe drives that are limited to PCIe 3 x2 links. Optane memory caching provides only about a 10% speed increase over the bare QLC, so this is clearly not a case where the caching drivers can effectively split accesses between the Optane and the NAND.

The second test, sustained sequential reads, uses queue depths from 1 to 32, with performance and power reported as the average across QD1, QD2 and QD4. Each queue depth is tested for one minute or up to 32 GB of data read from a 64 GB span of the drive. This test is run twice: once on a drive prepared by writing the test data sequentially, and again after the random write test has shuffled everything, producing internal fragmentation in the SSD that is invisible to the OS. These two scores represent the two extremes of real-world drive usage, where wear leveling and modification of existing data create some internal fragmentation that hurts performance, though usually not to the extreme degree shown here.



In the longer sequential read test, Optane caching still fails to effectively combine the performance of the H10's Optane and NAND portions. However, when reading data that was not written sequentially, the Optane cache helps greatly.

Graphs


In this test the Optane cache is a slight hindrance to sequential reads at low queue depths, but at QD8 and above it offers some advantage over using the QLC alone.

Sequential write performance


The bursts for the first sequential write test are structured identically to those of the sequential read test, except for the direction of data transfer. Each burst writes 128 MB as 128 KB operations at QD1, for a total of 1 GB written to a drive containing 16 GB of data.



Burst sequential write speed is very low on the Optane portion by itself, so this is one case where the QLC NAND side greatly helps the Optane H10. The QLC portion alone is competitive with TLC-based drives, but once the caching software gets in the way, the full H10 delivers only SATA-class performance.

The sustained sequential write test is structured identically to its read counterpart, except for the direction of data transfer: queue depths run from 1 to 32, each tested for one minute or up to 32 GB of data transferred, followed by up to one minute of idle time for the drive to cool off and collect garbage. The test is confined to a 64 GB span of the drive.



The situation is broadly similar to the previous test, though here some of the entry-level NVMe drives sink so low that the Optane Memory H10's score no longer looks so terrible. The QLC portion by itself is still better at sustained sequential writes than the cached configuration, however.

Graphs


There is no clear trend in the H10's performance during the sustained sequential write test. It mostly falls between the scores of the QLC and Optane portions, which means the caching software is getting in the way rather than letting the two halves combine for better performance than either delivers alone. It is possible that allowing more idle time to flush the Optane and SLC caches would reveal quite different behavior.

Mixed random load performance


The mixed random read/write test covers mixes ranging from pure reads to pure writes in 10% increments. Each mix is tested for up to 1 minute or 32 GB of data transferred. The test runs at queue depth 4 and is confined to a 64 GB span of the drive. Between mixes, the drive is given up to one minute of idle time, making the overall duty cycle 50%.
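The mix schedule can be sketched as follows (an illustration with made-up helper names; the random read/write choice only approximates the exact ratios, and a real run would issue the I/O at queue depth 4 against the drive under test):

```python
import os
import random

def mixed_random_step(path, read_fraction, n_ops=64, io_size=4096):
    """Run one mix step: a `read_fraction` share of the 4 KB operations
    are random reads, the rest random writes, all at random offsets."""
    blocks = os.path.getsize(path) // io_size
    fd = os.open(path, os.O_RDWR)
    buf = os.urandom(io_size)
    reads = writes = 0
    try:
        for _ in range(n_ops):
            offset = random.randrange(blocks) * io_size
            if random.random() < read_fraction:
                os.pread(fd, io_size, offset)
                reads += 1
            else:
                os.pwrite(fd, buf, offset)
                writes += 1
    finally:
        os.close(fd)
    return reads, writes

# sweep from 100% reads down to 100% writes in 10% steps
mixes = [i / 10 for i in range(10, -1, -1)]
```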



The Optane Memory H10's performance in the mixed random I/O test is worse than that of either half of the drive on its own. The test covers a wider range than the 32 GB Optane cache can hold, so the caching software's efforts ultimately do more harm than good.

Graphs


The QLC portion of the H10 performs similarly to the Optane-cached configuration through the read-heavy half of the test, though the caching makes performance inconsistent. Through the write-heavy half, the bare QLC configuration holds a significant speed advantage over the full H10 setup, until its SLC cache runs out at the very end.

Mixed sequential load performance


The mixed sequential read/write test differs from the mixed random test in performing 128 KB sequential accesses rather than 4 KB accesses at random locations. The sequential test is also conducted at queue depth 1. The range of mixes tested is the same, and the time and data-transfer limits match those described above.



In the mixed sequential I/O test, the Optane Memory H10 averages slightly better than the SATA SSDs, but a significant gap remains between the H10 and the high-performance TLC drives. This is another scenario in which the Optane caching software cannot help consistently, and overall H10 performance lands slightly below what the bare QLC NAND achieves with its own SLC cache.

Graphs


The caching software makes the Optane Memory H10's performance erratic, but the general trend is declining performance as the workload becomes more write-heavy. The QLC portion without the extra caching layer can deliver higher speeds through the second half of the test, because it handles mixed writes fairly efficiently.

Conclusion


The idea behind the Optane Memory H10 is quite intriguing. QLC NAND needs a performance boost to be competitive with TLC-based SSDs, and Intel's 3D XPoint memory is still the fastest non-volatile storage on the market. Unfortunately, too many factors undermine the H10's potential. It is two separate SSDs on one card, so the NAND side of the drive still needs its own DRAM, which adds cost. Caching is managed entirely in software, so the NAND SSD controller and the Optane controller cannot coordinate with each other, and Intel's caching software sometimes struggles to use both parts of the drive at once.

Some of these problems are compounded by our testing conditions: our test suite was designed with SLC write caching in mind, not a two-level cache that sometimes behaves more like RAID-0. None of our synthetic tests managed to trigger bandwidth aggregation between the H10's Optane and NAND portions. Intel cautions that their caching algorithms were optimized only for real-world storage patterns, and it is easy to see how some of our tests could diverge from those significantly. (In particular, many of our tests only give the system the opportunity to do block-level caching, while Intel's software can also perform file-level caching.) But that only underscores that the Optane Memory H10 is not a universal storage solution.

For the heaviest, most intense workloads, placing a small Optane cache in front of the QLC NAND only postpones the inevitable drop in performance. In some cases, the effort to keep the right data in the cache causes more performance problems than it solves. Then again, real applications that generate this much I/O are unlikely to run well on a 15-watt laptop processor anyway. Adding an Optane cache cannot magically turn a low-end SSD into a champion, and the Optane Memory H10 will probably never be a good choice for desktop PCs, which can easily accommodate a far wider range of storage options than a thin ultrabook.

At the lighter workloads more typical of an ultrabook, the Optane Memory H10 generally keeps pace with the other budget NVMe drives and, under favorable conditions, can be more responsive than any NAND-only drive. For everyday use the H10 is certainly preferable to a bare QLC drive, but against TLC-based drives it looks rather weak. We did not have the opportunity to take detailed power measurements of the Optane Memory H10, but it is unlikely to deliver better battery life than the top TLC-based SSDs.

If Intel is serious about QLC + Optane caching and wants to field a real competitor to TLC drives, they will have to do better than the Optane Memory H10. TLC SSDs will almost always have a more consistent performance profile than a tiered device. The Optane cache on the H10 is not effective enough to sustain good performance under heavy workloads, and under light loads it cannot raise performance enough to give the H10 a noticeable edge over the best TLC drives. Under ideal conditions, even bare QLC looks very fast at its peak thanks to SLC caching, so Intel clearly needs to focus on improving worst-case performance rather than optimizing use cases that already feel nearly instantaneous.

Optane has found considerable success in some segments of the storage market, but in the consumer market it is still searching for the right niche. QLC NAND is likewise still relatively young and unproven, though lately its promise of significantly lower prices has finally begun to materialize. The combination of QLC and Optane may yet produce an impressive consumer product, but it will take more work from Intel than this SSD, which looks hastily put together.
