Monsters after the holidays: AMD Threadripper 2990WX 32-Core and 2950X 16-Core (Part 5)

Thermal comparisons and XFR2: Do not forget to remove the plastic from the cooler!

Every build pursues some mix of goals: performance, power consumption, noise, thermals, or cost. It is very hard to hit all of them at once, so picking two or three is a good idea. How do you fail at ALL FIVE? Welcome to my world: the one in which I first tested the 32-core AMD Ryzen Threadripper 2990WX having forgotten to remove the plastic from my liquid cooler.

Do not assemble the system after a long flight.

Almost every new cooler, whether an air cooler, a liquid cooler, or a water block, ships with gaskets, foam, screws, fans, and a set of instructions. Depending on the manufacturer and the packaging, the base of the CPU cooler comes prepared in one of two ways:

  1. Pre-applied thermal grease
  2. A small self-adhesive plastic film that protects the polished surface during transport

Featured in our review is the Wraith Ripper, a massive air cooler made by Cooler Master but promoted by AMD as the base cooler for the new Threadripper 2 processors. Thermal grease comes thickly pre-applied across its entire base. When I tried to take photos, I made a mess of it.

Also included in our review is the Enermax Liqtech TR4 liquid cooler, which ships with a tube of thermal paste. The base of the unit, where it contacts the CPU, was covered with a protective self-adhesive plastic film.

TechTeamGB Twitter example

So, confession time. Our review kit arrived a day before I did. This all happened on my trip from the UK to San Francisco for the Flash Memory Summit and Intel's Datacenter Summit. In my suitcases I brought an X399 motherboard (ASUS ROG Zenith), three X399 chips (2990WX, 2950X, 1950X), an X299 motherboard (ASRock X299 OC Formula), several Skylake-X chips, a Corsair AX860i power supply, an RX 460, a mouse, a keyboard, and cables: essentially enough components to build two systems, using the monitor in the hotel room for testing. After an 11-hour direct flight, two hours at passport control, and more than an hour in an Uber to my hotel, I put together the 2990WX system.

I did not remove the plastic on the Enermax cooler. I did not notice this. I even applied thermal paste to the processor and did not suspect anything, even when I tightened the screws.

I set the system to the maximum supported memory frequency, installed Windows, installed security updates, installed the benchmarks, and left the test suite running overnight while I slept. I had no idea the plastic was still attached. By morning the test suite had finished. After running a few additional tests, such as latency measurements at base frequency, I went to swap the processor for the 2950X. That was when I performed an expressive facepalm.

Seeing thermal paste smeared across both the processor and the plastic, I realized I would have to run everything again. I removed the plastic, reinstalled the processor, and set the system up once more, this time with a far better thermal profile.

Thermal performance is important.

The goal of any system is to keep the processor within a temperature window where it runs stably: most processors are designed to work properly at temperatures up to 105 °C, beyond which they shut down to avoid destructive thermal damage. As the processor pushes electrons through its circuits and does its work, it consumes energy. That energy is lost as heat, which leaves the chip in two main directions: down through the socket and up into the cooler.

AMD's Threadripper processors use indium-tin solder as the thermal interface material between the silicon dies and the heat spreader. A direct metal-to-metal bond gives the best heat transfer. Modern Intel processors use silicone thermal paste for this layer instead, which transfers heat less well but has one important advantage: it survives many more thermal cycles. Metals expand as they heat up, and two bonded metals with different coefficients of thermal expansion can crack and lose effectiveness after enough heating cycles. Thermal paste avoids this problem, and it is also cheaper. So the choice of thermal interface is a trade-off between price, durability, and performance.

The CPU cooler sits on top of the heat spreader, with another thermal interface between them, and this one the user gets to choose. The cheapest option is ordinary silicone thermal grease that costs cents per gallon, but performance enthusiasts may pick a silver-based grease or another compound with good thermal characteristics. A paste's ability to spread under pressure is usually a plus. Extreme overclockers may use a layer of liquid metal, which, much like solder, bonds the processor to the cooler almost permanently.

So, what happens if you accidentally put a few microns of thermally useless plastic between the heat spreader and the CPU cooler?

First of all, heat transfer will be terrible. Thermal energy stays trapped in the paste, so the processor soaks up its own heat and its temperature rises. It is essentially the same situation as a cooler that is overwhelmed by a large processor: heat soak becomes a real problem. The temperature keeps climbing until the temperature gradient is steep enough to carry away the heat being produced. If the processor gets too hot before that point, its emergency thermal mode kicks in and cuts voltage and frequency to ultra-low levels. Performance hits the floor.
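To put rough numbers on the heat path described above, here is a minimal sketch of a lumped steady-state thermal model, with each layer treated as a thermal resistance in series. Every value in it (package power, ambient temperature, and all of the resistances, including the one for the plastic film) is a hypothetical figure chosen for illustration, not a measurement from this review. Only the 105 °C limit comes from the text.

```python
# Lumped steady-state thermal model: layers act as resistances in series.
# All numeric values are hypothetical and only illustrate the mechanism.

def steady_state_temp_c(ambient_c, power_w, resistances_c_per_w):
    """Die temperature once heat flowing out equals heat being produced."""
    return ambient_c + power_w * sum(resistances_c_per_w)


def max_sustained_power_w(ambient_c, limit_c, resistances_c_per_w):
    """Largest power the stack can carry away without exceeding limit_c."""
    return (limit_c - ambient_c) / sum(resistances_c_per_w)


AMBIENT_C = 25.0                  # hypothetical room temperature
LIMIT_C = 105.0                   # thermal limit quoted in the text
POWER_W = 250.0                   # hypothetical package power
CLEAN_STACK = [0.02, 0.03, 0.15]  # solder, paste, cooler in C/W (hypothetical)
PLASTIC_FILM = 0.25               # hypothetical resistance of the forgotten film

for label, stack in (("clean", CLEAN_STACK),
                     ("with plastic", CLEAN_STACK + [PLASTIC_FILM])):
    temp = steady_state_temp_c(AMBIENT_C, POWER_W, stack)
    budget = max_sustained_power_w(AMBIENT_C, LIMIT_C, stack)
    print(f"{label}: {temp:.1f} C at {POWER_W:.0f} W, "
          f"sustainable power {budget:.0f} W")
```

With these made-up numbers, the extra layer pushes the steady-state temperature past the limit at full power, so the only way for the chip to stay inside its window is to draw less power, which is exactly what throttling does.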

What does the user see? Imagine your processor running at 600 MHz while rendering instead of its normal 3125 MHz (see the previous page). Idle temperatures are higher, load temperatures are higher, case temperatures are higher. On the bright side, you can dry wet clothes on it, so the heat is not entirely wasted. A little overheating does not harm the processor, but a lot of it can leave it badly throttled.

Ultimately, this problem hurts AMD more than you might expect. AMD's new processors no longer implement turbo as a lookup table of "loaded cores -> turbo". Instead, turbo depends on the chip's power, current, and thermal limits. If there is headroom, the platform adds frequency and voltage. This boosting on thermal headroom is what AMD calls XFR2, or eXtended Frequency Range 2.
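The difference between the two approaches can be sketched in a few lines. This is only an illustration of the idea as described above, not AMD's actual algorithm: the function names, the table entries, the 25 MHz step, and all limit values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """A set of power, current, and temperature values (limits or telemetry)."""
    power_w: float
    current_a: float
    temp_c: float


def table_turbo(active_cores, table):
    """Lookup-table style: frequency depends only on the loaded core count.

    `table` maps "up to N active cores" to a frequency in MHz (hypothetical).
    """
    for max_cores in sorted(table):
        if active_cores <= max_cores:
            return table[max_cores]
    return min(table.values())


def headroom_turbo(freq_mhz, now, limits, step_mhz=25,
                   floor_mhz=600, ceiling_mhz=4200):
    """Headroom style: one control step based on live telemetry.

    Raise the clock while every limit has headroom, otherwise back off.
    The step size, floor, and ceiling are illustrative assumptions.
    """
    has_headroom = (now.power_w < limits.power_w
                    and now.current_a < limits.current_a
                    and now.temp_c < limits.temp_c)
    if has_headroom:
        return min(freq_mhz + step_mhz, ceiling_mhz)
    return max(freq_mhz - step_mhz, floor_mhz)


# Hypothetical numbers throughout.
limits = Reading(power_w=250.0, current_a=215.0, temp_c=68.0)
cool = Reading(power_w=180.0, current_a=150.0, temp_c=55.0)
hot = Reading(power_w=180.0, current_a=150.0, temp_c=72.0)

print(table_turbo(10, {4: 4200, 16: 3600, 32: 3000}))
print(headroom_turbo(3000, cool, limits))
print(headroom_turbo(3000, hot, limits))
```

In the headroom version, a better cooler lowers the temperature reading and so lets the loop keep stepping up, while a blocked cooler drives it down on every step.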

At AMD's Tech Day for Threadripper 2, we were shown graphs of how more capable coolers affect performance: roughly a 10% improvement in benchmark results from the extra heat dissipation. Run the system in a room with a low ambient temperature, and AMD quotes a 16% performance gain over a stock system.

However, the opposite is also true. With a piece of plastic sitting where good heat transfer was supposed to let frequency and voltage rise, we got a significant drop in performance instead.

Plastic performance:

So, despite running in a well air-conditioned hotel room, this extra plastic had a decisive effect on most of our tests. Here is the damage:

In every multithreaded test that loads the CPU heavily, performance drops significantly. Blender throughput fell by 20%, POV-Ray by 10%, and 3DPM by 19%. The PCMark results suffer less because the suite contains many single-threaded tests, and in a few tests we even saw a swing the other way, for example in WinRAR, which depends on DRAM. Benchmarks not shown include our compilation test, where the plastic-covered system was only 1% slower, and Dolphin, where the difference was one second.

What have I learned?

Do not be an idiot. Building a test bed from new components while very tired can mean running all your tests again.

Conclusions: not all cores are born equal

Designing a processor is often a matter of fine tuning. To get performance, the architect has to balance compute against bandwidth and always have enough data to "feed the beast", that is, to keep the processor cores busy. If the beast sits idle, it consumes energy without doing any work. Finding the right mix of resources is hard, which is why the leading processor makers employ thousands of engineers to get it right. And once the main design is done, a series of derivative products follows.

Sometimes exotic products fall out of the stack. The new generation of AMD Ryzen Threadripper processors is exactly that. Some of the parts look like direct replacements for the previous generation: similar, but with better latency and higher frequency. Those parts are already well understood, and they deliver the expected gains in the usual way. Then the extra silicon in the 2990WX, which has no direct memory access, throws a wrench into the finely tuned machine.

2950X (left) and 2990WX (right)

When every core has direct access to memory, as on the 2950X, all cores are equal and distributing the workload is fairly simple. The new processors give us the situation shown in the right-hand diagram: only some cores are attached directly to memory, and the rest are not. Moving data between one of the "distant" cores and main memory takes an extra hop, which adds latency. And when all cores request access at once, congestion results.
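The cost of that extra hop can be sketched with a toy latency model. The layout below (four dies, two of them with memory attached) and both latency figures are assumptions made for illustration, not measurements from this review, and the model ignores congestion entirely.

```python
# Toy model of memory latency on a part where only some dies have memory.
# The die layout and both latency values are hypothetical.

HAS_MEMORY = {0: True, 1: False, 2: True, 3: False}  # assumed die layout
LOCAL_NS = 65.0   # hypothetical latency for a die with memory attached
HOP_NS = 40.0     # hypothetical penalty for one extra hop


def memory_latency_ns(die):
    """Latency seen by a core on `die` when it reaches main memory."""
    return LOCAL_NS if HAS_MEMORY[die] else LOCAL_NS + HOP_NS


def average_latency_ns(threads_per_die):
    """Thread-weighted average latency for a placement {die: thread_count}."""
    total = sum(threads_per_die.values())
    weighted = sum(count * memory_latency_ns(die)
                   for die, count in threads_per_die.items())
    return weighted / total


near_only = {0: 8, 2: 8}              # 16 threads, all on memory-attached dies
all_dies = {0: 8, 1: 8, 2: 8, 3: 8}   # 32 threads spread across every die

print(average_latency_ns(near_only))
print(average_latency_ns(all_dies))
```

Under these assumptions, a memory-sensitive workload sees higher average latency as soon as it spills onto the dies without memory, which is one way to picture why some benchmarks stop scaling.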

To fully exploit this architecture, the workload must be light on memory access. In tasks such as particle movement, ray tracing, scene rendering, and decompression, loading all 32 cores makes the processor the star of our tests and sets new records.

But like two-faced Janus, in other workloads that have historically scaled with core count, such as physics, transcoding, and compression, the bimodal core arrangement causes a significant loss of performance. As a result there seems to be no middle ground: a workload either does superbly on the new processor or lands near the back of our high-end test pack.

Part of the problem comes down to how power is distributed in these very large processors. As shown on page 4, the more dies that are in play, or the larger the mesh, the more power goes not to the cores but to the internal networks, such as the uncore or Infinity Fabric. Comparing the single IF link in the 2950X with the six in the 2990WX, we found that IF now consumes 60-73% of total chip power at low loads and 25-40% at high loads.

In fact, at full load a chip like the 2990WX spends only 60% of its power budget on CPU frequency. The EPYC 7601 spent only 50% of its power budget that way under load, because of its additional memory channels. Rest assured that once AMD and Intel finish fighting over core counts, the next target on their list will be the interconnect.
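As a quick back-of-the-envelope illustration of what those percentages mean per core, here is a small sketch. The 60% and 50% core shares come from the text above. The package power and the EPYC core count used in the example are assumptions for illustration only.

```python
def per_core_power_w(package_w, core_fraction, cores):
    """Average power available to each core after the fabric takes its share."""
    return package_w * core_fraction / cores


# Hypothetical package power; the core fractions are the ones quoted above.
PACKAGE_W = 250.0

print(per_core_power_w(PACKAGE_W, 0.60, 32))  # 2990WX-style split
print(per_core_power_w(PACKAGE_W, 0.50, 32))  # EPYC-style split, assumed 32 cores
```

The point of the arithmetic is that every watt spent on the interconnect is a watt the cores cannot turn into frequency.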

A side effect of a chip that does not put all of its power into the cores, and that also has a bimodal architecture, is that some workloads will not scale, and in some cases they regress.

Big Boss: 32-core AMD behemoth

There is no doubt that when the AMD Ryzen Threadripper 2990WX gets the chance to stretch its legs, it does so gladly. We were able to overclock the system to 4 GHz on all cores simply by changing BIOS settings, although AMD also supports Precision Boost Overdrive in Windows to squeeze more out of the chip. At that point, power consumption with half of the cores running at 4.0 GHz jumps to 260 W, and a fully loaded CPU climbs to 450-500 W and at times exceeds 600 W. Users will need to make sure that their motherboard and power supply are up to the task.

This is the point where I finally say whether we recommend buying AMD's new parts. Being able to drop a 2950X into your socket in place of a 1950X, and at a lower price, looks very attractive to us. However, the 2950X is already a niche high-performance product, and the 2990WX takes that baton and runs with it, making the most powerful processor a niche within a niche. To be honest, its performance is not always as great as you might expect, and it makes sense only for a narrow set of workloads where it is unmatched. And although it beats almost every other processor in our compilation test, there is one processor that beats it: the 2950X.

For most users, the 2950X is sufficient. For a select few, the 2990WX will prove to be the best processor in the world.
