Monsters after the holidays: AMD Threadripper 2990WX 32-Core and 2950X 16-Core (part 4)

Original author: Ian Cutress

Power Consumption, TDP and Prime95 vs. POV-Ray

For most of us, processor power means about 15 W in laptops and 65-95 W in desktop systems. High-end desktop processors have always been hungrier, so a TDP of 130 W or 140 W is normal for them. When AMD released a 220 W processor on the old Vishera platform, a member of the Bulldozer family overclocked to 5.0 GHz, people wondered whether AMD had gone completely crazy: many motherboards had a compatible AMD socket, but handling a TDP of 220 W or more required vendors to release a number of new boards. Today the most powerful Intel processor on the market has an official TDP of 205 W, but AMD has gone further and raised the bar to 250 W.

The two new WX processors, the 32-core 2990WX and the 2970WX, are rated at 250 W. In both, all four silicon dies are active and there are six active Infinity Fabric links. These processors are designed to reach a new level of performance, and AMD's slides show an all-core turbo frequency of 3.6 GHz. The two processors that replace the X-series parts are rated at 180 W, the same as the first-generation Threadripper processors.

However, not all TDPs are equal. The ways Intel and AMD measure TDP have changed over the years and have drifted a long way from reality. Let me explain.

TDP is such a joke

TDP, or thermal design power, is not a measure of power consumption. Technically it is a rating for the cooler: to do its job, the cooler must be able to dissipate at least the TDP. Actual power consumption should be somewhat higher, because heat also escapes from the processor into the socket and from the socket into the motherboard, which helps with cooling but is not counted in the TDP. TDP and processor power consumption are often treated as the same thing because the difference between them is small.

Let's start with AMD. Its TDP calculation is based on a simple formula:

TDP (W) = (load temperature, °C - idle temperature, °C) / thermal rating of the cooler, °C per watt

Thus, when AMD determines the TDP of its Ryzen 7 2700X with a load temperature of about 62 °C, an idle temperature of 42 °C and a cooler rated at 0.189 °C per watt (the Wraith Max), we get a value of about 105 W.

The AMD formula has two problems. First, the temperature of the loaded processor can be changed with a different cooler or with external airflow. Second, the result depends heavily on the cooler's rating. With a large liquid cooler rated higher, at 0.400 °C per watt for example, the nominal TDP of any processor comes out lower: for the Ryzen 7 2700X it would be only 50 W. TDP and power consumption are not the same thing, and the ratio between them can move in either direction as soon as AMD picks a different cooler for the calculation.
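The two figures quoted above are easy to reproduce. The following is a minimal sketch of the formula exactly as stated in this article, not AMD's internal tooling; the function name `amd_tdp_watts` is made up for illustration.

```python
def amd_tdp_watts(load_temp_c: float, idle_temp_c: float, cooler_c_per_watt: float) -> float:
    """TDP = (load temperature - idle temperature) / cooler rating in degrees C per watt."""
    if cooler_c_per_watt <= 0:
        raise ValueError("cooler rating must be positive")
    return (load_temp_c - idle_temp_c) / cooler_c_per_watt


# Ryzen 7 2700X figures quoted in the text: 62 C under load, 42 C idle.
wraith_max_tdp = amd_tdp_watts(62, 42, 0.189)    # cooler rated at 0.189 C/W
other_cooler_tdp = amd_tdp_watts(62, 42, 0.400)  # cooler rated at 0.400 C/W

print(f"Wraith Max:   {wraith_max_tdp:.1f} W")
print(f"0.400 C/W:    {other_cooler_tdp:.1f} W")
```

The same 20-degree temperature delta yields roughly 106 W or 50 W depending only on the cooler rating plugged in, which is the whole problem with the formula.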

Intel's version of TDP is a bit more complicated, but whether the number makes sense is another matter. Intel defines the TDP of its processors only at the base frequency and ignores turbo frequencies. So if Intel releases a processor with a 95 W TDP, a base frequency of 3.2 GHz, a single-core turbo of 4.7 GHz and an all-core turbo of 4.2 GHz, the 95 W figure is guaranteed only at the 3.2 GHz base frequency. On any motherboard that enables turbo (which in practice means any motherboard), the processor will draw more than its official TDP under load.

This is very annoying. Intel's marketing advertises the single-core turbo of its processors and does not publish the lower "all-core" turbo values. We are told that this is internal company information covered by a non-disclosure agreement. In any case, every processor whose all-core turbo frequency is above its base frequency will consume more than its stated TDP.

A good example is the Core i7-8700 with its 65 W TDP. It has a base frequency of 3.2 GHz, a single-core turbo of 4.6 GHz and an all-core turbo of 4.3 GHz. If we load more threads while limiting power to 65 W, we get the following:

Is it really worth taking TDP values seriously? Take them with a grain of salt.

Power consumption

There are several ways to measure the power consumption of a processor. The easiest is to use a wall meter, which reports the power draw of the entire system, including losses in the motherboard's power delivery. The harder method is to instrument the board to measure current through the 12 V connector and to measure the processor voltage at the overclocking measurement points that some motherboards provide. The third way is to read the hardware registers with suitable software.

Reading registers is a double-edged sword. First, you rely on internal measurements, which often have a fairly wide margin of error. Second, you rely on the processor manufacturer to report accurate data about the processor, and that trust is not always justified (!). On the plus side, the processor can give you much more detail, such as per-core power, DRAM power, IO/interconnect power and integrated graphics power, which gives a general picture of how power is distributed.

Hardware registers are how the system reports its own state to itself: how much power it is using, and how it should adjust voltage and frequency in response to current, power or thermal limits. Another advantage is that this data is easy to use in test scripts.
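To give a feel for what register-based measurement looks like in a script, here is a rough sketch. It assumes a Linux machine that exposes Intel RAPL energy counters through the powercap sysfs interface, which is not the platform or tooling used in this review; the function names are hypothetical. The idea is the same everywhere: sample a cumulative energy counter twice and divide the difference by the elapsed time.

```python
import time
from pathlib import Path

# Assumption: Linux powercap sysfs node for the first Intel RAPL package domain.
# This is NOT the tooling used in the review; it only illustrates the idea.
RAPL_DOMAIN = Path("/sys/class/powercap/intel-rapl:0")


def average_power_watts(start_uj: int, end_uj: int, seconds: float, max_range_uj: int) -> float:
    """Average power between two readings of a cumulative energy counter (microjoules)."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # Assume the counter wrapped around once between the two samples.
        delta_uj += max_range_uj
    return delta_uj / 1_000_000 / seconds


def sample_package_power(interval_s: float = 1.0) -> float:
    """Read the counter twice and return average package power in watts.

    May need root; raises OSError if the node is missing or unreadable.
    """
    max_range_uj = int((RAPL_DOMAIN / "max_energy_range_uj").read_text())
    start_uj = int((RAPL_DOMAIN / "energy_uj").read_text())
    started = time.monotonic()
    time.sleep(interval_s)
    end_uj = int((RAPL_DOMAIN / "energy_uj").read_text())
    elapsed = time.monotonic() - started
    return average_power_watts(start_uj, end_uj, elapsed, max_range_uj)


# The arithmetic can be checked without any hardware access:
# 90 J consumed over one second is 90 W.
print(average_power_watts(1_000_000, 91_000_000, 1.0, 2**32))
```

Because the counter is maintained by the processor itself, the result is only as trustworthy as the manufacturer's reporting, which is exactly the caveat raised above.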

Power testing is often a subject of controversy. Typically a specialized "power virus" is used, which loads every part of the processor at maximum power simultaneously. A power virus is useful for checking overclock stability, but it has one drawback: its results generally do not reflect power consumption under everyday load. There is a fine line between real-world testing and a synthetic test designed to push every possible joule through the chip. Software such as LINPACK is often used as an effective power test, and internal tools from Intel and AMD can load the chip even more.

Prime95 is a popular tool, well optimized to load almost every core up to its power limit. Its workload is semi-synthetic: it is based on calculating prime numbers, but the stress test discards the results and serves only to draw power. During this review we also experimented with POV-Ray as a power test: it draws even more power than Prime95 and is a real ray-tracing workload.

Let me know what you think, because I am still deciding which tool is best for power tests. Prime95 has problems with large numbers of cores (it is sometimes difficult to get a result beyond 25 threads), and to make POV-Ray work we have to adjust how it is loaded, since it is oriented toward loading cores rather than threads. Even so, we expect to be able to report results by thread count. We will indicate which software was used at each stage of testing (our version of the POV-Ray test was ready only halfway through the review, so most of the data comes from Prime95).

Total power consumption

As a first set of results, I want to present the total power consumption of the processor measured in several situations. At idle:

Next we load a single core with two threads using Prime95. Our testing methodology pins both threads to the same core when the processor's cores can handle multiple threads. Users with mostly single-threaded workloads will see power consumption in this range. The same applies to systems where Windows is constantly doing something in the background.

The third test loads the system with four threads using Prime95. This is the range of workload most people run every day: several browser tabs, a couple of windows, a few applications, perhaps a game or two.

Raising the load to twelve threads (again with Prime95) brings us to users with heavy, multi-tasking workloads. These are gamers who stream, or users who start a render while working on other tasks in parallel.

The final graph shows power consumption at full load. For this test we run the maximum number of threads with Prime95; in future we plan to use POV-Ray here, because it behaves much better at high thread counts. The only drawback is that an overclocked 2990WX can finish the POV-Ray test in less than 20 seconds.

Power consumption of a single core

Before creating the POV-Ray power test, I ran both new Threadripper processors through the Prime95 test at every thread count and recorded the power consumption of each core at each load level.

On the 2950X, when the first core is loaded we see that it draws ~23 W. This is a lot compared to Zeppelin cores. The same holds when two cores are loaded. With three cores loaded, consumption falls to 18.8 W per core. Given that this chip has four CCXs, the question is whether this happens because the threads land on the same CCX (which is apparently what should happen) and we hit a per-CCX power limit. With four cores loaded, each core draws about 17.4 W.

Moving to five loaded cores, we find that the fifth core runs at 18.2 W and the other four at 16.8 W. This indicates that the fifth core sits on a new CCX. The transition from eight cores to nine shows the same thing: the ninth core draws 17.5 W while the other eight draw about 14.3 W. By the end of the run, with all 16 cores in use, power falls to 7-9 W per core.

Total processor power is ~178 W, close to the 180 W TDP, with ~135 W going to the cores and the rest to the uncore (the non-core hardware: Infinity Fabric, IO, IMC).
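As a quick sanity check on these numbers, the uncore figure and the average per-core power follow directly from the two reported values. This is only arithmetic on the approximate figures quoted above (~178 W package, ~135 W cores, 16 cores); the function name is illustrative.

```python
def split_package_power(package_w: float, cores_w: float, core_count: int):
    """Return (uncore watts, uncore share of package, average watts per core)."""
    uncore_w = package_w - cores_w
    return uncore_w, uncore_w / package_w, cores_w / core_count


uncore_w, uncore_share, per_core_w = split_package_power(178, 135, 16)

print(f"uncore:   {uncore_w:.0f} W ({uncore_share:.0%} of package power)")
print(f"per core: {per_core_w:.1f} W on average")
```

The result is about 43 W for the uncore, roughly a quarter of package power, and an average of about 8.4 W per core, which sits inside the 7-9 W per-core range observed with all 16 cores loaded.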

As for the 2990WX, the resulting picture looks very, very strange.

For the most part, the power data for up to 15 cores is about the same as on the 2950X. However, as the thread count rises, it becomes clear that the first die is heavily favored. When additional threads bring the second die into play, the power on its cores turns out to be much lower, as little as 2.4 W per core. At full load the cores on the first Zeppelin die draw about 6.6 W each, while the rest of the processor's cores get about 2.4 W. Something causes the first die to receive priority for power over the others. It should also be noted that the chip draws about 180 W in total, not the 250 W its TDP suggests.

Around this time we finished writing the POV-Ray power test script. I tried it on the 2990WX, and here are the results. They are much higher than expected:

Surprisingly, as the number of threads increased, the load was distributed very evenly. We were even able to use the full 250 W TDP at stock settings with a good cooler. With the processor fully loaded, we saw 193 W drawn by the cores and 55 W by everything else. At no point did we see active cores sag below 3 W. With all cores loaded, each core drew a comfortable 6 W. We reached a processor power of 240-250 W at a load of about 40 threads. Beyond that, each additional loaded core simply caused power to be redistributed.

Two ideas came to mind. The first was easy to check: maybe the BIOS had become "stuck" at a 180 W power limit after the 2950X was installed? I rechecked by testing a 1920X I had tested before, and then ran the 2990WX tests again. A complete BIOS reset did not affect the results either. I can say that this is not a power limit imposed by the BIOS. The second idea was to check the frequencies. Looking at a single reference point (40 threads loaded), we found a discrepancy, but only in power.

During the Prime95 test, the first die ran at 7 W per core at 3575 MHz. The second die ran at 3 W per core at 3525 MHz. The other (idle) cores ran at 1775 MHz or 2000 MHz and consumed milliwatts.

During the POV-Ray test, each active core drew about 9.1 W and ran at 3575 MHz. The idle cores were all at 2000 MHz (except for three at 1775 MHz) and consumed milliwatts each.

Apart from the per-core power data, the two runs looked broadly the same in frequency. The POV-Ray frequencies are slightly higher, which goes along with higher overall power consumption in POV-Ray.

Ultimately it comes down to this: the Prime95 power test does not work as expected beyond a threshold of 20 cores or so, or on multi-die chips. In future we will use our POV-Ray test, which can squeeze more out of modern many-core processors.

Core vs. Uncore Power

Returning to our earlier discussion of Infinity Fabric, here is the split of power consumption in the POV-Ray test for the 2990WX.

Although there are some deviations from the previous result, the data (apart from peak consumption) generally matches our uncore power test with Prime95. Infinity Fabric still accounts for 55-60 W. As a result, uncore power as a percentage of total power starts at 75% with two threads and falls to 22% by the time 40 threads are running.

Overclocking: 4.0 GHz for 500 W

Who said a 250 W processor should not be overclocked? AMD prides itself on selling every processor with an unlocked multiplier, and on using solder as the thermal interface material.

Time for a confession: we did not have enough time for overclocking. This processor has a base frequency of 3.0 GHz and a turbo of 4.2 GHz. In an air-conditioned room, with an Enermax Liqtech cooler rated for 500 W and all cores loaded under POV-Ray, each core ran at 3150 MHz, which is a long way from the turbo frequency. The first thing I tried was setting the all-core turbo to 4.2 GHz, the same as the single-core turbo. That did not produce a working result.

The next stage of my overclocking experiments, however, surprised me. I set the CPU multiplier to 40x in the BIOS, for 4.0 GHz on all cores, all the time. I did not adjust the voltage; I left it on auto and let the ASUS motherboard figure it out. And the processor ran flawlessly through our test suite at 4.0 GHz. I was shocked.

All I did for this overclock was change the setting from "auto" to "40". The POV-Ray tests, which draw the most power, ran successfully. Every test in the suite ran. Although heat output was high at maximum load, the cooler handled it easily.

At full load in the POV-Ray test the processor drew about 500 W, and the cooler is rated for 500 W. At one point we saw a spike to 511 W, of which 440 W went to the cores (13.8 W per core) and 63 W to the uncore (IF, IO, IMC), which corresponds to 12.5% of total power. If you want the interconnect to take a smaller share of the power, overclock the processor!
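To put the overclocked figures next to the stock ones, the small sketch below recomputes per-core power and uncore share from the core and uncore values reported in this article for the fully loaded 2990WX under POV-Ray (193 W + 55 W at stock, 440 W + 63 W at 4.0 GHz). The `PowerSample` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerSample:
    """Core and uncore power for one configuration, as reported in the article."""
    label: str
    cores_w: float
    uncore_w: float
    active_cores: int = 32  # the 2990WX with every core loaded

    @property
    def total_w(self) -> float:
        return self.cores_w + self.uncore_w

    @property
    def per_core_w(self) -> float:
        return self.cores_w / self.active_cores

    @property
    def uncore_share(self) -> float:
        return self.uncore_w / self.total_w


stock = PowerSample("stock", cores_w=193, uncore_w=55)
overclocked = PowerSample("4.0 GHz", cores_w=440, uncore_w=63)

for sample in (stock, overclocked):
    print(
        f"{sample.label:>8}: {sample.total_w:.0f} W total, "
        f"{sample.per_core_w:.2f} W per core, "
        f"uncore {sample.uncore_share:.1%}"
    )
```

Two things fall out of this. The 12.5% figure matches the sum of the two reported components (503 W) rather than the 511 W peak. And the uncore itself barely grows, from 55 W to 63 W, so its share shrinks from about 22% to 12.5% only because the cores more than double their draw.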

We then set the frequency to 4.1 GHz, and that also seemed to work until we loaded the system fully. As mentioned above, 4.2 GHz never produced a working result, even with extra voltage. For those who want to go deeper into overclocking, liquid cooling may be the answer.

Performance at 4.0 GHz

So, if all cores run at 3125 MHz at stock, overclocking to 4000 MHz should give a 28 percent increase in performance, right? Here are the results from some key tests in our suite.

Overclocking the 2990WX gave mixed results. It worked really well in some tests, while in others the chip still lagged behind the 2950X because of the two-module architecture.

In these tests overclocking paid off well: throughput rose by 19% in Blender, by 19% in POV-Ray and by 19% in 3DPM. In other tests the 2990WX is still behind the 2950X (Photoscan) or continues to lag (application loading, WinRAR).

Overclocking will not fix every performance problem on the 2990WX, but it certainly benefits the processor.
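It is worth comparing the ideal gain from the frequency increase with what was actually measured. The sketch below uses only figures quoted in this article (3125 MHz to 4000 MHz, a 19% gain in the best cases, and the stock and overclocked POV-Ray core-plus-uncore power of 248 W and 503 W); the performance-per-watt number is a rough estimate derived from them, not a separate measurement.

```python
def ideal_gain(stock_mhz: float, overclock_mhz: float) -> float:
    """Performance gain if throughput scaled perfectly with frequency."""
    return overclock_mhz / stock_mhz - 1


expected = ideal_gain(3125, 4000)  # 28% if scaling were perfect
observed = 0.19                    # Blender, POV-Ray and 3DPM in this review

# Fraction of the ideal gain that actually showed up in the benchmarks.
scaling_efficiency = observed / expected

# Rough efficiency estimate: relative performance divided by relative power.
stock_power_w = 193 + 55        # cores + uncore at stock
overclock_power_w = 440 + 63    # cores + uncore at 4.0 GHz
relative_perf_per_watt = (1 + observed) / (overclock_power_w / stock_power_w)

print(f"ideal gain:             {expected:.0%}")
print(f"observed gain:          {observed:.0%}")
print(f"scaling efficiency:     {scaling_efficiency:.0%}")
print(f"perf per watt vs stock: {relative_perf_per_watt:.2f}x")
```

Even in the best-scaling tests, about two thirds of the theoretical 28% arrives as real throughput, and it costs roughly twice the power, so performance per watt falls to a little under 0.6x of stock.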
