Reducing the operating voltage of the processor, or tuning Enhanced Intel SpeedStep

    Modern desktop and (especially) mobile processors use a number of energy-saving technologies: ODCM, CxE, EIST, and so on. Today we are interested in what is probably the highest-level of them: flexible control of the core frequency and voltage at run time, i.e. AMD's Cool'n'Quiet and PowerNow!, and Intel's Enhanced SpeedStep (EIST).

    Most often, the user of a desktop or laptop only needs to enable support for a particular technology in the BIOS and/or operating system; fine-tuning is usually not offered, although, as practice shows, it can be very useful. In this article I will explain how to control the processor core voltage from the operating system (using the Intel Pentium M and FreeBSD as an example), and why you might want to.

    Despite the large number of manuals, it is rare to find a detailed description of Enhanced SpeedStep technology from the point of view of the operating system (and not the end user), especially in Russian, so a significant part of the article is devoted to implementation details and is somewhat theoretical in nature.

    I hope this article will be useful not only to FreeBSD users: we will also touch on GNU / Linux, Windows, and Mac OS X. However, in this case, the specific operating system is of secondary importance.

    Foreword


    Last year I upgraded the processor in my old laptop: I replaced the stock Pentium M 735 with a 780, the fastest it would take, so to speak. The laptop started to run hotter under load (heat output rose by about 10 watts); I did not pay much attention to this (apart from cleaning and lubricating the cooler, just in case), but one fine day, during a long compilation, the computer ... simply switched off (the temperature had reached the critical hundred degrees). I put the value of the sysctl variable hw.acpi.thermal.tz0.temperature into the tray so that I could watch the temperature and, if need be, interrupt a "heavy" task in time. After a while, though, my vigilance lapsed (the temperature always stayed within the normal range), and it all happened again. At that point I decided that I no longer wanted to live in fear of a sudden shutdown during a long CPU load with my hand hovering over Ctrl-C, nor to torture the processor.

    Typically, changing the nominal voltage means raising it to keep the processor stable during overclocking (i.e., at an increased frequency). Roughly speaking, each voltage corresponds to a certain range of frequencies at which the processor can operate, and the overclocker's task is to find the maximum frequency at which the processor is not yet “buggy”. Our task is, in a sense, symmetrical: for a known frequency (more precisely, as we will soon find out, a set of frequencies), find the lowest voltage that ensures stable operation of the CPU. I do not want to lower the operating frequency and lose performance: the laptop is far from top-end as it is. Besides, lowering the voltage pays off more.

    A bit of theory


    As you know, the heat dissipation of a processor is proportional to its capacitance, its frequency, and the square of its voltage (anyone interested in why can try to derive the dependence themselves, treating the processor as a set of elementary CMOS inverters (logical NOT gates), or follow the links: one, two, three).
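
    Written out, this is the standard first-order model of CMOS dynamic power. The activity factor α (the fraction of the capacitance actually switched per cycle) belongs to that model and is an assumption here; the article does not discuss it:

```latex
P_{\text{dyn}} \approx \alpha \, C \, V_{cc}^{2} \, f
\qquad\Longrightarrow\qquad
\frac{P_{\text{new}}}{P_{\text{old}}}
  = \left( \frac{V_{\text{new}}}{V_{\text{old}}} \right)^{2}
\quad \text{(at fixed } f \text{ and } C \text{)}
```

    So, at a fixed frequency, a 10% reduction in voltage cuts dynamic power by roughly 19% (0.9² = 0.81), whereas a 10% reduction in frequency alone cuts it by only 10%.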

    Modern mobile processors can consume up to 50-70 watts, all of which is ultimately dissipated as heat. That is a lot (think of incandescent bulbs), especially for a laptop, which will devour its battery under load. With so little room inside, the heat will most likely have to be removed actively, which means spending extra energy on spinning the cooler fan (or several).

    Naturally, this state of affairs suited no one, and processor manufacturers began to think about how to optimize power consumption (and, accordingly, heat dissipation) while also preventing the processor from overheating. For those interested, I recommend a number of wonderful articles by Dmitry Besedin; in the meantime, I will get straight to business.

    A bit of history


    SpeedStep technology (version 1.1) first appeared in the second generation of the Pentium III (the mobile Coppermine for laptops, manufactured on the 0.18 µm process, 2000), which could switch between a high and a low frequency, by means of a variable multiplier, depending on the load or on the power source (mains or battery). In economy mode, the processor consumed about half the power.

    With the transition to the 0.13 µm process, the technology received version number 2.1 and became “enhanced”: the processor could now reduce not only the frequency but also the voltage. Version 2.2 was an adaptation for the NetBurst architecture, and with the third version (the Centrino platform) the technology was officially named Enhanced Intel SpeedStep (EIST).

    Version 3.1 (2003) was first used in the first and second generations of Pentium M processors (the Banias and Dothan cores). The frequency varied (at first it only switched between two values) from 40% to 100% of the base frequency, in steps of 100 MHz (Banias) or 133 MHz (Dothan, our case). At the same time, Intel introduced dynamic management of the L2 cache capacity, which optimizes power consumption further. Version 3.2 (Enhanced EIST) is an adaptation for multi-core processors with a shared L2 cache. (A small FAQ from Intel on SpeedStep technology.)

    Now, instead of blindly following the numerous how-tos and tutorials, let us download the PDF and try to understand how EST works (I will use this abbreviation from here on, because it is shorter and more universal).

    How does EST work?


    So, EST lets you control the performance and power consumption of the processor dynamically, while it is running. Unlike earlier implementations, which required hardware support (in the chipset) to change the processor's operating parameters, EST lets software, i.e. the BIOS or the operating system, change the multiplier (the ratio of the processor frequency to the bus frequency) and the core voltage (Vcc) depending on the load, the power source, the CPU temperature and/or the OS settings (policy).

    While running, the processor is in one of several power states: T (throttle), S (sleep), C (idle), P (performance), and switches between them according to certain rules (p. 386 of the ACPI 5.0 specification).

    Processor power states


    Each processor present in the system must be described in the DSDT, most often in the \_PR namespace, and usually provides a number of methods through which it interacts with the operating system (the PM driver) and which describe the capabilities of the processor (_PDC, _PPC), the supported states (_CST, _TSS, _PSS) and the control interfaces (_PTC, _PCT). The required values for each CPU (if it is included in the so-called CPU support package) are determined by the motherboard BIOS, which fills in the corresponding ACPI tables and methods (p. 11 of the PDF) when the machine boots.

    EST controls the processor's P-states, so these are the ones that interest us. For example, the Pentium M supports six P-states (see Figure 1.1 and Table 1.6 in the PDF), which differ in voltage and frequency:

    [Figure: Power vs. Core Voltage for Intel Pentium M 1.6 GHz]


    In the general case, when the processor is not known in advance, the only more or less reliable method of working with it (and the one Intel recommends) is ACPI. You can also talk to a specific processor directly, bypassing ACPI, through its MSRs (Model-Specific Registers), even from the command line: FreeBSD has provided the cpucontrol(8) utility for this since version 7.2.

    To find out whether your processor supports EST, look at bit 16 of the IA32_MISC_ENABLE register (0x1A0); it must be set:

    # kldload cpuctl
    # cpucontrol -m 0x1a0 /dev/cpuctl0 | (read _ msr hi lo ; echo $((lo >> 16 & 1)))
    1

    A similar command for GNU / Linux (msr-tools package required):

    # modprobe msr
    # echo $((`rdmsr -c 0x1a0` >> 16 & 1))
    1

    The transition between states happens on a write to the IA32_PERF_CTL register (0x199). The current operating mode can be found by reading the IA32_PERF_STATUS register (0x198), which is updated dynamically (Table 1.4 in the PDF). From here on I will omit the IA32_ prefix for brevity.

    Let us first try to read the current value of PERF_STATUS:

    # cpucontrol -m 0x198 /dev/cpuctl0
    MSR 0x198: 0x0612112b 0x06000c20

    According to the documentation, the current state is encoded in the lower 16 bits (if you run the command several times, their value may change, which means EST is working). On closer inspection, the other bits are clearly not garbage either. A bit of googling reveals what they mean.

    Structure of the PERF_STATUS register


    The data read from PERF_STATUS has the following structure (assuming little-endian storage):

    struct msr_perf_status {
    	unsigned curr_psv	: 16;		/* Current PSV */
    	unsigned status	: 8;		/* Status flags */
    	unsigned min_mult	: 8;		/* Minimum multiplier */
    	unsigned max_psv	: 16;		/* Maximum PSV */
    	unsigned init_psv	: 16;		/* Power-on PSV */
    };

    The three 16-bit fields are so-called Performance State Values (PSV), whose structure we will look at below: the current PSV, the maximum PSV (processor-dependent) and the PSV at system start (power-on). The current value (curr_psv) obviously changes whenever the operating mode changes; the maximum (max_psv) usually stays constant; the power-on value (init_psv) does not change: as a rule it equals the maximum for desktops and servers, but the minimum for mobile CPUs. The minimum multiplier (min_mult) for Intel processors is almost always six. The status field holds a few flags, set for example on EST or THERM events (i.e., at the moment of a P-state change or of processor overheating, respectively).

    Now that we know the purpose of all 64 bits of the PERF_STATUS register, we can decode the word read above: 0x0612112b 0x06000c20 ⇒ the power-on PSV is 0x0612, the maximum PSV is 0x112b, the minimum multiplier is 6 (as expected), the flags are clear, and the current PSV is 0x0c20. What exactly do these 16 bits mean?
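
    The same decoding can be done mechanically. Below is a minimal sketch in POSIX shell; the function name decode_perf_status is hypothetical, and it assumes the two 32-bit words are passed in the order cpucontrol(8) prints them, high word first.

```shell
#!/bin/sh
# Split the two 32-bit words read from MSR 0x198 (PERF_STATUS)
# into the fields of struct msr_perf_status shown above.
# Usage: decode_perf_status HI LO
decode_perf_status() {
    hi=$(($1))
    lo=$(($2))
    # High word: init_psv (bits 48..63), max_psv (bits 32..47).
    # Low word:  min_mult (bits 24..31), status (bits 16..23),
    #            curr_psv (bits 0..15).
    printf 'init_psv=0x%04x max_psv=0x%04x min_mult=%d status=0x%02x curr_psv=0x%04x\n' \
        $((hi >> 16 & 0xffff)) $((hi & 0xffff)) \
        $((lo >> 24 & 0xff)) $((lo >> 16 & 0xff)) $((lo & 0xffff))
}

# The word read above:
decode_perf_status 0x0612112b 0x06000c20
```

    For the word read above this prints init_psv=0x0612 max_psv=0x112b min_mult=6 status=0x00 curr_psv=0x0c20, which agrees with the manual decoding.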

    Performance State Value (PSV) Structure


    It is very important to know and understand what a PSV looks like, because it is in this form that the processor's operating modes are specified.

    struct psv {
    	unsigned vid 	: 6;	/* Voltage Identifier */
    	unsigned _reserved1	: 2;
    	unsigned freq	: 5;	/* Frequency Identifier */
    	unsigned _reserved2	: 1;
    	unsigned nibr	: 1;	/* Non-integer bus ratio */
    	unsigned slfm	: 1;	/* Dynamic FSB frequency (Super-LFM) */
    };

    Dynamic FSB frequency switching tells the processor to skip every second FSB cycle, i.e. to halve the operating frequency; this feature first appeared in Core 2 Duo processors (the Merom core) and does not concern us. Neither does the non-integer bus ratio, a special mode supported by some processors which, as the name implies, allows finer control over their frequency.

    Two fields relate to EST itself: the Frequency Identifier (Fid), which is numerically equal to the multiplier, and the Voltage Identifier (Vid), which corresponds to the voltage level (and is usually the least documented).

    Voltage Identifier


    Intel is very reluctant to disclose (an NDA is usually required) exactly how the voltage identifier is encoded for each processor. Fortunately, for most popular CPUs the formula is known; in particular, for our Pentium M (and many others): Vcc = Vid0 + (Vid × Vstep), where Vcc is the actual voltage, Vid0 is the base voltage (at Vid == 0) and Vstep is the step. A table for some popular processors (all values in millivolts):

    CPU               Vid0    Vstep   Vboot    Vmin    Vmax
    Pentium M         700.0   16.0    xxxx.x   xxx.x   xxxx.x
    E6000, E4000      825.0   12.5    1100.0   850.0   1500.0
    E8000, E7000      825.0   12.5    1100.0   850.0   1362.5
    X9000             712.5   12.5    1200.0   800.0   1325.0
    T9000             712.5   12.5    1200.0   750.0   1300.0
    P9000, P8000      712.5   12.5    1200.0   750.0   1300.0
    Q9000D, Q8000D    825.0   12.5    1100.0   850.0   1362.5
    Q9000M            712.5   12.5    1200.0   850.0   1300.0

    The multiplier (i.e., Fid) is stored in the PSV shifted 8 bits to the left, and the lower six bits hold the Vid. Since in our case the remaining bits can be neglected, the PSV, the processor frequency, the system bus clock and the physical voltage are related by a simple formula (for the Pentium M):

    PSV = (frequency / bus clock) * 256 + (Vcc - 700) / 16
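
    As a quick check of the formula, here is a small shell sketch. psv_encode is a hypothetical helper name; it takes the multiplier directly (instead of dividing the frequency by the bus clock) to stay within integer arithmetic, and it hard-codes the Pentium M constants of 700 mV and 16 mV.

```shell
#!/bin/sh
# psv_encode MULTIPLIER VCC_MV
# Prints the PSV for a Pentium M: Fid in bits 8..12, Vid in bits 0..5,
# where Vid = (Vcc - 700 mV) / 16 mV.
psv_encode() {
    printf '0x%04x\n' $(( ($1 << 8) | (($2 - 700) / 16) ))
}

psv_encode 17 1388   # multiplier 17 at 1388 mV
psv_encode 6 988     # multiplier 6 at 988 mV
```

    The two calls print 0x112b and 0x0612, i.e. the maximum and power-on PSVs seen in PERF_STATUS earlier.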

    Now consider the control register (PERF_CTL). A write to it should be done as follows: first read the current value (the entire 64-bit word), change the necessary bits in it, and write it back to the register (the so-called read-modify-write).

    Structure of the PERF_CTL register


    struct msr_perf_ctl {
    	unsigned psv	: 16;	/* Requested PSV */
    	unsigned _reserved1	: 16;
    	unsigned ida_diseng	: 1;	/* IDA disengage */
    	unsigned _reserved2	: 31;
    };

    The IDA (Intel Dynamic Acceleration) disengage bit lets you temporarily turn off adaptive (opportunistic) frequency control on Intel Core 2 Duo T7700 and later processors; again, it does not concern us. The lower 16 bits (PSV) are the mode we “ask” the processor to switch to.
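
    The read-modify-write rule for PERF_CTL boils down to a single mask-and-merge. A minimal sketch, assuming the old register value is available as one 64-bit number (cpucontrol(8) prints two 32-bit words, so they would first be combined as (hi << 32) | lo). perf_ctl_update is a hypothetical name, and the sample value with bit 32 set is made up purely for illustration.

```shell
#!/bin/sh
# perf_ctl_update OLD_VALUE NEW_PSV
# Keeps bits 16..63 of the old PERF_CTL value and replaces
# only the low 16 bits with the requested PSV.
perf_ctl_update() {
    printf '0x%x\n' $(( ($1 & ~0xffff) | ($2 & 0xffff) ))
}

# Illustrative old value: bit 32 set, current PSV 0x0c20.
perf_ctl_update 0x100000c20 0x0817
```

    This prints 0x100000817: the upper bits survive and only the PSV changes. The result is what would then be written back to MSR 0x199; cpucontrol(8) documents a -m msr=value form for writes, but check the manual page on your system before relying on it.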

    _PSS table


    The _PSS table is an array of states (a Package in ACPI terminology) or a method that returns such an array; each state (P-state) is in turn defined by the following structure (p. 409 of the ACPI specification):

    struct Pstate {
    	unsigned CoreFrequency;	/* Core CPU operating frequency, MHz */
    	unsigned Power;		/* Maximum power dissipation, mW */
    	unsigned Latency;		/* Worst-case latency of CPU unavailability during transition, µs */
    	unsigned BusMasterLatency;	/* Worst-case latency while Bus Masters are unable to access memory, µs */
    	unsigned Control;		/* Value to be written to the PERF_CTL to switch to this state */
    	unsigned Status;		/* Value (should be equal to the one read from PERF_STATUS) */
    };

    Thus, each P-state is characterized by a core operating frequency, a maximum power dissipation, transition latencies (in effect, the time of a transition between states, during which the CPU and memory are unavailable) and, most interesting of all, the PSV that corresponds to this state and must be written to PERF_CTL in order to enter it (Control). To verify that the processor has successfully switched to the new state, read the PERF_STATUS register and compare it with the value in the Status field.

    The operating system's EST driver may “know” about some processors, i.e. be able to manage them without ACPI support. But this is rare, especially these days (although for undervolting on Linux, somewhere before version 2.6.20, you had to patch the tables in the driver, and as late as 2011 this method was very common).

    It is worth noting that the EST driver can work even with an unknown processor and no _PSS table, because the maximum and minimum values can be read from PERF_STATUS (in which case, obviously, the number of P-states degenerates to two).

    Enough theory. What to do with all this?


    Now that we know 1) the purpose of all the bits in the relevant MSRs, 2) how exactly the PSV is encoded for our processor, and 3) where in the DSDT to look for the necessary settings, it is time to compile a table of the default frequencies and voltages. Dump the DSDT and look for the _PSS table in it. For the Pentium M 780 it should look something like this:

    Default _PSS values
        Name (_PSS, Package (0x06) {	// 6 states (P-states) defined in total
            Package (0x06) {
                0x000008DB,			// 2267 MHz (cf. Fid × FSB clock)
                0x00006978,			// 27000 mW
                0x0000000A,			// 10 µs (matches the specification)
                0x0000000A,			// 10 µs
                0x0000112B,			// 0x11 = 17 (multiplier, Fid), 0x2b = 43 (Vid)
                0x0000112B
            },
            Package (0x06) {
                0x0000074B,			// 1867 MHz (82% of maximum)
                0x000059D8,			// 23000 mW
                0x0000000A,
                0x0000000A,
                0x00000E25,			// Fid = 14, Vid = 37
                0x00000E25
            },
            Package (0x06) {
                0x00000640,			// 1600 MHz (71% of maximum)
                0x00005208,			// 21000 mW
                0x0000000A,
                0x0000000A,
                0x00000C20,			// Fid = 12, Vid = 32
                0x00000C20
            },
            Package (0x06) {
                0x00000535,			// 1333 MHz (59% of maximum)
                0x00004650,			// 18000 mW
                0x0000000A,
                0x0000000A,
                0x00000A1C,			// Fid = 10, Vid = 28
                0x00000A1C
            }, 
            Package (0x06) {
                0x0000042B,			// 1067 MHz (47% of maximum)
                0x00003E80,			// 16000 mW
                0x0000000A,
                0x0000000A,
                0x00000817,			// Fid = 8, Vid = 23
                0x00000817
            },
            Package (0x06) {
                0x00000320,			// 800 MHz (35% of maximum)
                0x000032C8,			// 13000 mW
                0x0000000A,
                0x0000000A,
                0x00000612,			// Fid = 6, Vid = 18
                0x00000612
            }
        })

    So, we know the default Vid for each P-state: 43, 37, 32, 28, 23, 18, which corresponds to voltages from 1388 mV down to 988 mV. The whole point of undervolting is that these voltages are almost certainly somewhat higher than what the processor really needs for stable operation. Let us try to find the "limits of what is allowed".
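
    The Vid-to-voltage conversion for the _PSS table can be reproduced with a few lines of shell. This is a sketch assuming the Pentium M constants (700 mV base, 16 mV step) and the PSV layout described earlier; psv_decode is a hypothetical helper name.

```shell
#!/bin/sh
# psv_decode PSV
# Prints the multiplier (Fid), the Vid and the resulting core voltage
# for a Pentium M: Vcc = 700 mV + 16 mV * Vid.
psv_decode() {
    fid=$(( $1 >> 8 & 0x1f ))    # bits 8..12
    vid=$(( $1 & 0x3f ))         # bits 0..5
    echo "Fid=$fid Vid=$vid Vcc=$((700 + 16 * vid)) mV"
}

# Control values from the default _PSS table of the Pentium M 780.
for ctl in 0x112b 0x0e25 0x0c20 0x0a1c 0x0817 0x0612; do
    psv_decode "$ctl"
done
```

    The first and last lines of the output are "Fid=17 Vid=43 Vcc=1388 mV" and "Fid=6 Vid=18 Vcc=988 mV", the two ends of the range quoted above.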

    I wrote a simple shell script for this, which gradually lowers the Vid while running a simple loop (the powerd(8) daemon must, of course, be killed first). This gave me the voltages at which the processor at least did not hang; then I ran the Super Pi test and rebuilt the kernel several times. Later I raised the Vid for the two highest frequencies by one more step, because otherwise gcc would occasionally crash with an illegal instruction error. After several days of experiments, I arrived at the following set of “stable” Vids: 30, 18, 12, 7, 2, 0.
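
    The original script is not reproduced here. The sketch below only illustrates the sweep it implies: for a fixed multiplier, it lists the candidate PSVs from the default Vid down to zero. It deliberately writes nothing to the hardware; vid_sweep is a hypothetical name. In a real run, each value would be written to PERF_CTL (read-modify-write) and followed by a load test, stopping at the first hang or crash.

```shell
#!/bin/sh
# vid_sweep FID START_VID
# Prints, one per line, the PSVs to try for multiplier FID,
# stepping the Vid from START_VID down to 0. Dry run only.
vid_sweep() {
    fid=$1
    vid=$2
    while [ "$vid" -ge 0 ]; do
        printf '0x%04x\n' $(( (fid << 8) | vid ))
        vid=$((vid - 1))
    done
}

# First three candidates for the lowest P-state (Fid 6, default Vid 18).
vid_sweep 6 18 | head -n 3
```

    For the lowest P-state this starts with 0x0612, 0x0611, 0x0610 and ends at 0x0600, which is where the sweep for multiplier 6 eventually stopped in the results above.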

    Results Analysis


    Now that we have empirically determined the minimum safe voltages, it is interesting to compare them with the initial ones:

    Frequency, MHz (multiplier)   Vid old   Vid new   Change in Vcc
    2267 (17)                     43        30        -15%
    1867 (14)                     37        18        -24%
    1600 (12)                     32        12        -26%
    1333 (10)                     28        7         -29%
    1067 (8)                      23        2         -31%
    800 (6)                       18        0         -29%

    Lowering even the maximum voltage by 15% gave quite tangible results: a long load no longer leads to overheating and an emergency shutdown, and the temperature now almost never exceeds 80 °C. The projected battery life in "office" mode, judging by acpiconf -i 0, has increased from 1 hour 40 minutes to 2 hours 25 minutes. (Not that much, but lithium-ion cells "wear out" over time, and I have not replaced the battery since I bought the laptop about seven years ago.)
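
    The voltage changes, and a rough estimate of what they mean for power, can be recomputed in shell. This is a sketch assuming the Pentium M voltage formula (700 mV + 16 mV × Vid) and the rule from the theory section that power scales with the square of the voltage at a fixed frequency; leakage is ignored, so the second number is only an estimate. vcc_mv and compare are hypothetical helper names.

```shell
#!/bin/sh
# vcc_mv VID: core voltage in mV for a Pentium M.
vcc_mv() {
    echo $((700 + 16 * $1))
}

# compare OLD_VID NEW_VID
# Prints the change in Vcc and the remaining dynamic power,
# both in percent, rounded to the nearest integer.
compare() {
    old=$(vcc_mv "$1")
    new=$(vcc_mv "$2")
    drop=$(( ((old - new) * 100 + old / 2) / old ))
    power=$(( (new * new * 100 + old * old / 2) / (old * old) ))
    echo "-$drop% $power%"
}

# Old and new Vid for each of the six P-states.
for pair in 43:30 37:18 32:12 28:7 23:2 18:0; do
    compare "${pair%:*}" "${pair#*:}"
done
```

    For the top P-state this gives "-15% 72%": a 15% lower voltage leaves roughly 72% of the original dynamic power at the same frequency.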

    Now we need to make sure the settings are applied automatically. You could, for example, modify the cpufreq(4) driver so that it takes PSV values from its own table rather than through ACPI. But that is inconvenient, if only because you have to remember to patch the driver on every system update, and in general it looks more like a dirty hack than a solution. You could probably patch powerd(8) somehow, which is bad for roughly the same reasons. You could simply run a script that lowers the voltage by writing directly to the MSR (which is in fact what I did to find the “stable” voltages), but then you have to track and handle transitions between states yourself (not just P-states but any states, for example when the laptop wakes from sleep). That is not it either.

    If the PSV values come through ACPI, the most logical thing is to change the _PSS table in the DSDT. Fortunately, there is no need to tinker with the BIOS for this: FreeBSD can load the DSDT from a file (modification of ACPI tables has been covered on Habr more than once, so we will not dwell on it here). Replace the required fields in the DSDT:

    Undervolting patch for _PSS
    @@ -7385,8 +7385,8 @@
                 0x00006978,
                 0x0000000A,
                 0x0000000A,
    -            0x0000112B,
    -            0x0000112B
    +            0x0000111D,
    +            0x0000111D
             },
             Package (0x06)
    @@ -7395,8 +7395,8 @@
                 0x000059D8,
                 0x0000000A,
                 0x0000000A,
    -            0x00000E25,
    -            0x00000E25
    +            0x00000E12,
    +            0x00000E12
             },
             Package (0x06)
    @@ -7405,8 +7405,8 @@
                 0x00005208,
                 0x0000000A,
                 0x0000000A,
    -            0x00000C20,
    -            0x00000C20
    +            0x00000C0C,
    +            0x00000C0C
             },
             Package (0x06)
    @@ -7415,8 +7415,8 @@
                 0x00004650,
                 0x0000000A,
                 0x0000000A,
    -            0x00000A1C,
    -            0x00000A1C
    +            0x00000A07,
    +            0x00000A07
             },
             Package (0x06)
    @@ -7425,8 +7425,8 @@
                 0x00003E80,
                 0x0000000A,
                 0x0000000A,
    -            0x00000817,
    -            0x00000817
    +            0x00000802,
    +            0x00000802
             },
             Package (0x06)
    @@ -7435,8 +7435,8 @@
                 0x000032C8,
                 0x0000000A,
                 0x0000000A,
    -            0x00000612,
    -            0x00000612
    +            0x00000600,
    +            0x00000600
             }
         })

    Compile the new AML file (ACPI bytecode) and edit /boot/loader.conf so that FreeBSD loads our modified DSDT instead of the default one:

    acpi_dsdt_load="YES"
    acpi_dsdt_name="/root/undervolt.aml"

    That, in general, is all. Just remember to comment out these two lines in /boot/loader.conf if you ever change the processor.

    Even if you are not going to lower the nominal voltage, the ability to configure the processor's state control (not only P-states) may come in handy. After all, it often happens that a buggy BIOS fills in the tables incorrectly, incompletely, or not at all (for example, because the machine ships with a Celeron that does not support EST, and the manufacturer does not officially provide for replacing it). In that case you will have to do all the work yourself. Note that adding a _PSS table alone may not be enough: for example, C-states are defined by the _CST table, and it may also be necessary to describe the control procedures themselves (Performance Control, _PCT). Fortunately, this is simple and is described in fair detail, with examples, in chapter 8 of the ACPI specification.

    Undervolting on GNU / Linux


    To tell the truth, at first I thought it would be enough to read the Gentoo Undervolting Guide and simply adapt it for FreeBSD. It turned out not to be so simple, because the document proved to be remarkably clueless (which is actually odd for the Gentoo Wiki). Unfortunately, I found nothing similar on their new site and had to make do with an old copy; and although I understand that this guide has largely lost its relevance, I will still criticize it a little. :-)

    For some reason I was told straight away, without any declaration of war, to patch the kernel (in FreeBSD, mind you, we did not have to modify any system code at all). To hard-code into the driver internals, or write into some init scripts, the values of certain "safe" voltages, obtained by nobody knows whom or how, from a special table (in which the Pentium M 780 is mockingly represented by a row consisting of nothing but question marks). To follow advice, some of which was written by people who clearly do not understand what they are talking about. And most importantly, it is completely unclear why and how these magical replacements of some numbers with others work; no way is offered to “touch” EST before patching and rebuilding the kernel, and MSRs and working with them from the command line are never mentioned. Modifying the ACPI tables is not considered as an alternative, let alone as the preferred option.

    ThinkWiki has a slightly better (and newer) guide, but not by much. The ArchWiki page is terser still. This line is especially entertaining:

    # echo 34 26 18 12 8 5 > /sys/devices/system/cpu/cpu0/cpufreq/phc_vids

    It just begs for the Lost numbers, “4, 8, 15, 16, 23, 42” (though in reverse order, which somewhat spoils the joke).

    Perhaps the most sensible description of the whole process for Linux is the one by Pat Erley, which I linked to above.

    Undervolting on Windows and Mac OS X


    There is little point in discussing Windows at length: there is software and there are forum discussions, so I will just leave a couple of links here.

    Mac OS X interacts quite closely with ACPI (and expects it to work correctly), and table modification is one of the main ways of adapting it to specific hardware. So the first thing that comes to mind is to dump and patch your DSDT in the same way. An alternative method: google://IntelEnhancedSpeedStep.kext, for example one, two, three.

    Another "wonderful" utility (fortunately, already outdated) offers to buy for $ 10 the ability to change voltage and frequency. :-)

    What else to read


    For FreeBSD: the forum topic, as well as the notorious mailing list discussion; Alexander Motin's original message has been wikified for convenience. For Linux, you can start with a good article on ArchWiki.

    For those who want to delve into the topic, in addition to the official documentation of processor manufacturers and the links provided in the text, here is an excellent selection of materials (research articles, presentations) on a wide range of energy management issues (caution, Comic Sans).
