Backdoors in the microcode of x86 assembler instructions

    We stopped trusting software long ago, so we audit it, reverse-engineer it, step through it in a debugger, and run it in a sandbox. But what about the processor our software runs on? We trust this little piece of silicon blindly and wholeheartedly. Yet modern hardware suffers from the same problems as software: secret undocumented functionality, bugs, vulnerabilities, malware, trojans, rootkits, and backdoors.



    The x86 ISA (Instruction Set Architecture) is one of the longest continuously evolving instruction set architectures in history. Since the 8086, whose design began in 1976, the ISA has been changed and extended constantly while preserving backward compatibility with the original specification. Over more than 40 years it has accumulated, and keeps accumulating, a multitude of new modes and instruction sets, each adding another layer on top of an already overloaded design. Because of the policy of complete backward compatibility, modern x86 processors still contain instructions and modes that are otherwise entirely forgotten. The result is a processor architecture that is a tangled maze of new and antique technologies. Such an extremely complex environment creates many security problems for the processor. For this reason, x86 processors cannot claim to be the trusted root of critical cyber infrastructure.


    Do you still trust your processor?


    The security of programs and of the operating system rests on the security of the hardware they run on. As a rule, software developers do not consider that this hardware may be untrusted or even malicious. When hardware misbehaves (whether intentionally or not), software security mechanisms become worthless. Over the years, various protected-processor models have been proposed: Intel SGX, AMD Pacifica, and others. Nevertheless, the enviable regularity with which critical flaws are reported (recent examples include Meltdown and Spectre) and undocumented debugging functions are discovered suggests that our unreserved trust in processors is unfounded.


    Modern x86 processors are a cumbersome and intricate interweaving of the latest and antique technologies. The 8086 had 29 thousand transistors, the Pentium had 3 million, Broadwell had 3.2 billion, and Cannon Lake has more than 10 billion.



    With so many transistors, it is no wonder that modern x86 processors are riddled with undocumented instructions and hardware vulnerabilities. Undocumented instructions that were found almost by chance include ICEBP (0xF1), LOADALL (0x0F07), and apicall (0x0FFFF0) [1], which make it possible to unlock the processor for unauthorized access to protected memory areas.


    As for the numerous hardware vulnerabilities of processors (see the two figures below), they allow an attacker to escalate privileges [3], extract cryptographic keys [4], create new assembler instructions [2], change the behavior of existing assembler instructions [2], install hooks on assembler instructions [2], take control of "hardware-accelerated virtualization" [7], interfere with "atomic cryptographic calculations" [7], and finally, for dessert, enter "god mode": give yourself access to the legitimate hardware backdoor, Intel ME (which allows remote access even to a computer that is turned off) [8]. And all this without leaving any digital traces.




    Modern processors are more software than hardware.


    Strictly speaking, modern processors cannot even be called "hardware" in the full sense of the word, because their most critical functionality (including the ISA) is provided by updatable microcode. Initially, microcode was mainly responsible for controlling the decoding and execution of complex assembler instructions: floating-point operations, MMX primitives, string instructions with a REP prefix, and so on. Over time, however, microcode has been given more and more responsibility for operations inside the processor. For example, modern extensions of Intel processors such as AVX (Advanced Vector Extensions) and VT-d (hardware virtualization) are implemented in microcode.


    In addition, microcode today is responsible for, among other things, maintaining processor state, managing the cache, and managing power saving. To save power, the microcode implements an interrupt mechanism that handles the power states: C-states (depth of power-saving sleep, from the active state to deep sleep) and P-states (different combinations of voltage and frequency). For example, microcode is responsible for resetting the L2 cache on entering the C4 state and on leaving it.


    Why do processor manufacturers use microcode?


    Manufacturers of x86 processors use microcode to decompose complex assembler instructions (which can be up to 15 bytes long) into a chain of simple microinstructions, which simplifies the hardware architecture and makes diagnostics easier. In effect, microcode is an interpreter between the external, user-visible CISC architecture and the internal hardware RISC architecture.
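
    The decomposition described above can be pictured with a toy lookup table. This is only a sketch: the micro-op names and the table contents are hypothetical and do not correspond to any real Intel or AMD microcode.

```python
from typing import Dict, List

# Hypothetical "microcode ROM": each complex instruction maps to a fixed
# chain of simple, RISC-like micro-ops. The names are invented for illustration.
MICROCODE_ROM: Dict[str, List[str]] = {
    # Read-modify-write on memory: load, add, store.
    "add [mem], reg": ["ld tmp, [mem]", "add tmp, tmp, reg", "st [mem], tmp"],
    # Push: adjust the stack pointer, then store.
    "push reg": ["sub sp, sp, 4", "st [sp], reg"],
}


def decode(instruction: str) -> List[str]:
    """Return the micro-op chain for one instruction (toy table lookup)."""
    try:
        return list(MICROCODE_ROM[instruction])
    except KeyError:
        raise ValueError(f"no microcode entry for {instruction!r}") from None


def decode_program(program: List[str]) -> List[str]:
    """Translate a list of complex instructions into one flat micro-op stream."""
    micro_ops: List[str] = []
    for instruction in program:
        micro_ops.extend(decode(instruction))
    return micro_ops


if __name__ == "__main__":
    for micro_op in decode_program(["push reg", "add [mem], reg"]):
        print(micro_op)
```

    The point of the model is that changing one table entry changes what an instruction does without touching the rest of the design, which is exactly the property that later makes microcode attractive to an attacker.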


    When errors are found in the CISC architecture (first of all, in the ISA), manufacturers publish microcode updates that can be loaded into the processor by the motherboard BIOS/UEFI or by the operating system during boot. This microcode-based update system gives processor manufacturers flexibility and lowers the cost of fixing errors in their hardware. The notorious FDIV bug, which hit Intel Pentium processors hard in 1994, made it even more obvious that high-tech hardware is as error-prone as software. The incident increased manufacturers' interest in microcode-based processor architectures, and Intel and AMD began building their processors around updatable microcode: Intel starting with the Pentium Pro (P6), released in 1995, and AMD starting with the K7, released in 1999.
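
    One way to see which microcode revision is currently loaded is sketched below. It assumes a Linux system on x86, where /proc/cpuinfo normally carries a "microcode" line per logical CPU. The function names are illustrative, and on other platforms the reader simply returns None.

```python
from typing import Optional


def parse_microcode_revision(cpuinfo_text: str) -> Optional[int]:
    """Return the first 'microcode' revision in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # The kernel prints the revision in hex, e.g. "0xf0".
            return int(value.strip(), 16)
    return None


def read_microcode_revision(path: str = "/proc/cpuinfo") -> Optional[int]:
    """Read the revision from the running system, or None if unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_microcode_revision(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    revision = read_microcode_revision()
    if revision is None:
        print("microcode revision not available")
    else:
        print(f"microcode revision: {revision:#x}")
```

    Comparing this value before and after a BIOS/UEFI or operating system update shows whether a new microcode patch was applied.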


    Everything secret comes to light


    Although processor manufacturers try to keep the microcode architecture and its update mechanism strictly secret, the adversary never sleeps. Scraps of scattered information (mainly from patents such as AMD's RISC86 [5]) and careful reverse engineering of official BIOS updates (as was done for the K8 [6]) are gradually shedding light on the secrets of microcode (see, for example, the figure below, "AMD processor microcode update mechanism"). Thanks to the steady evolution of reverse-engineering tools, both software and hardware [2], promising fuzzing techniques [1], and the emergence of open source tools such as Microparse [9] and Sandsifter [10], an attacker can learn everything about microcode and even write programs in it.
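
    As an illustration of how much can be read from an update file without any secrets, here is a minimal sketch that parses the 48-byte header of an Intel microcode update, assuming the layout Intel documents publicly in its Software Developer's Manual. AMD and VIA updates use different layouts, the function names are illustrative, and the encrypted payload itself is not interpreted.

```python
import struct
from typing import NamedTuple

# Nine little-endian 32-bit fields followed by 12 reserved bytes (48 bytes).
HEADER = struct.Struct("<9I12x")


class UpdateHeader(NamedTuple):
    header_version: int
    update_revision: int
    date_bcd: int             # BCD, laid out as 0xMMDDYYYY
    processor_signature: int  # CPUID signature the update targets
    checksum: int
    loader_revision: int
    processor_flags: int
    data_size: int            # 0 means the default of 2000 bytes
    total_size: int           # 0 means the default of 2048 bytes


def parse_header(blob: bytes) -> UpdateHeader:
    """Decode the fixed-size header at the start of an update blob."""
    if len(blob) < HEADER.size:
        raise ValueError("blob is shorter than the 48-byte header")
    return UpdateHeader(*HEADER.unpack_from(blob))


def format_date(date_bcd: int) -> str:
    """Render the BCD date field as YYYY-MM-DD."""
    return f"{date_bcd & 0xFFFF:04x}-{date_bcd >> 24:02x}-{(date_bcd >> 16) & 0xFF:02x}"


def checksum_ok(blob: bytes) -> bool:
    """All 32-bit words of a valid update sum to zero modulo 2**32."""
    total = parse_header(blob).total_size or 2048
    if len(blob) < total or total % 4:
        return False
    words = struct.unpack_from(f"<{total // 4}I", blob)
    return sum(words) & 0xFFFFFFFF == 0
```

    The checksum only protects against accidental corruption. It says nothing about authenticity, which is why the reverse-engineering work cited above focuses on how the processor itself verifies an update.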



    For example, [2] presents a microcode "Hello world!" (the first step toward trojanized microcode): a "microhook" (a microcode program that intercepts assembler instructions) that counts how many times the processor executes the div instruction. This microhook is injected into the handler of the div assembler instruction.
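
    The idea of the counting microhook can be modeled in a few lines. The sketch below is a toy software model, not the microcode from [2]: the ToyCPU class, its handler table, and install_counting_hook are all hypothetical names, and processor faults (division by zero, quotient overflow) are not modeled.

```python
from typing import Callable, Dict, List, Tuple


class ToyCPU:
    """Tiny CPU model: only 'mov reg, imm' and 'div reg' are supported."""

    def __init__(self) -> None:
        self.regs: Dict[str, int] = {"eax": 0, "ebx": 0, "edx": 0}
        self.div_count = 0  # hidden counter maintained by the hook
        # The handler table plays the role of the per-instruction microcode.
        self.handlers: Dict[str, Callable[..., None]] = {
            "mov": self._mov,
            "div": self._div,
        }

    def _mov(self, dst: str, value: int) -> None:
        self.regs[dst] = value & 0xFFFFFFFF

    def _div(self, src: str) -> None:
        # Like x86 div r32: edx:eax / src, quotient to eax, remainder to edx.
        dividend = (self.regs["edx"] << 32) | self.regs["eax"]
        quotient, remainder = divmod(dividend, self.regs[src])
        self.regs["eax"] = quotient & 0xFFFFFFFF
        self.regs["edx"] = remainder

    def run(self, program: List[Tuple]) -> None:
        for op, *args in program:
            self.handlers[op](*args)


def install_counting_hook(cpu: ToyCPU) -> None:
    """Wrap the div handler so that every execution is counted."""
    original = cpu.handlers["div"]

    def hooked(src: str) -> None:
        cpu.div_count += 1  # the injected behavior
        original(src)       # then the unchanged division

    cpu.handlers["div"] = hooked
```

    Running a program with two div instructions on a hooked ToyCPU leaves the architectural results unchanged while div_count reaches 2, so the software cannot tell from the register values that anything was intercepted.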



    The same paper [2] presents a more advanced microhook that sits quietly in the div ebx assembler instruction without revealing its presence and is activated only when specific conditions hold at the moment div ebx is executed: the ebx register contains the value B and eax contains the value A. When activated, this microhook increments the eip register (the instruction pointer) by one. As a result, the program (which had the courage to use the div ebx instruction) continues with an offset: not from the first byte following the div ebx instruction, but from the second. For any other values in eax and ebx, div ebx works as usual. What is the practical value of this? For example, it can invisibly activate a hidden chain of assembler instructions.
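
    The effect of skipping one byte can be shown with a toy interpreter for a handful of 32-bit x86 encodings. This is a sketch under stated assumptions: the trigger values standing in for A and B are hypothetical, the model assumes the division is still performed when the hook fires, and only the opcodes listed in the comments are decoded.

```python
import struct
from typing import Dict, List, Tuple

TRIGGER_EAX = 0x1337  # hypothetical value "A"
TRIGGER_EBX = 0x42    # hypothetical value "B"

# mov r32, imm32 opcodes (32-bit mode): B8+r followed by a 4-byte immediate.
MOV_IMM32 = {0xB8: "eax", 0xB9: "ecx", 0xBA: "edx", 0xBB: "ebx"}


def build_demo(eax: int, ebx: int) -> bytes:
    """mov eax; mov ebx; mov edx, 0; div ebx; mov ecx, 0x90404040; hlt."""
    return (
        b"\xB8" + struct.pack("<I", eax)
        + b"\xBB" + struct.pack("<I", ebx)
        + b"\xBA" + struct.pack("<I", 0)
        + b"\xF7\xF3"              # div ebx
        + b"\xB9\x40\x40\x40\x90"  # immediate hides: inc eax x3, nop
        + b"\xF4"                  # hlt
    )


def run(code: bytes, hooked: bool) -> Tuple[Dict[str, int], List[str]]:
    regs = {"eax": 0, "ecx": 0, "edx": 0, "ebx": 0}
    trace: List[str] = []
    eip = 0
    while eip < len(code):
        op = code[eip]
        if op in MOV_IMM32:
            (imm,) = struct.unpack_from("<I", code, eip + 1)
            regs[MOV_IMM32[op]] = imm
            trace.append(f"mov {MOV_IMM32[op]}, {imm:#x}")
            eip += 5
        elif code[eip:eip + 2] == b"\xF7\xF3":
            fire = (hooked and regs["eax"] == TRIGGER_EAX
                    and regs["ebx"] == TRIGGER_EBX)
            dividend = (regs["edx"] << 32) | regs["eax"]
            regs["eax"], regs["edx"] = divmod(dividend, regs["ebx"])
            trace.append("div ebx")
            eip += 2
            if fire:
                eip += 1  # the microhook's only visible effect
        elif op == 0x40:  # inc eax
            regs["eax"] = (regs["eax"] + 1) & 0xFFFFFFFF
            trace.append("inc eax")
            eip += 1
        elif op == 0x90:  # nop
            trace.append("nop")
            eip += 1
        elif op == 0xF4:  # hlt
            trace.append("hlt")
            break
        else:
            raise ValueError(f"unsupported opcode {op:#x} at offset {eip}")
    return regs, trace
```

    With the trigger values, the hooked run skips the B9 opcode byte and executes the immediate of mov ecx as three inc eax instructions and a nop. With any other operands, or without the hook, the same bytes decode as an ordinary mov ecx.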



    These two examples demonstrate how legitimate assembler instructions can be used to hide arbitrary trojan code.


    Moreover, an attacker can activate the malicious payload remotely, for example when the activation condition is met on a web page the attacker controls. This is possible thanks to the Just-in-Time (JIT) and Ahead-of-Time (AOT) compilers built into modern web browsers. These compilers make it possible to emit a predetermined stream of machine instructions even from a program written entirely in high-level JavaScript (see the last listing, just above).
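
    Why high-level code can dictate machine-code bytes is easiest to see with integer constants. The sketch below assumes a hypothetical compiler back end that emits every 32-bit constant as mov eax, imm32 (opcode B8). Real browser compilers are far more complex and apply their own hardening, so this only illustrates the principle.

```python
import struct
from typing import List

NOP = 0x90            # one-byte x86 nop, used as padding
MOV_EAX_IMM32 = 0xB8  # opcode of "mov eax, imm32"


def payload_to_constants(payload: bytes) -> List[int]:
    """Split the desired bytes into little-endian 32-bit integer constants."""
    padded = payload + bytes([NOP]) * (-len(payload) % 4)
    return [
        struct.unpack_from("<I", padded, offset)[0]
        for offset in range(0, len(padded), 4)
    ]


def emit_constants(constants: List[int]) -> bytes:
    """Hypothetical back end: one 'mov eax, imm32' per source-level constant."""
    return b"".join(
        bytes([MOV_EAX_IMM32]) + struct.pack("<I", value) for value in constants
    )


if __name__ == "__main__":
    hidden = bytes([0x40, 0x40, 0x40, 0x90])  # inc eax x3, nop (32-bit mode)
    constants = payload_to_constants(hidden)
    print([hex(value) for value in constants])
    print(emit_constants(constants).hex())
```

    The attacker never writes machine code directly. They only choose numbers in the script, and the chosen bytes appear one byte after each opcode in the emitted stream, which is precisely where a one-byte shift of eip would land.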


    Bibliography
    1. Christopher Domas. Breaking the x86 ISA // DEFCON 25. 2017.
    2. Philipp Koppe. Reverse Engineering x86 Processor Microcode // Proceedings of the 26th USENIX Security Symposium. 2017. pp. 1163–1180.
    3. Matthew Hicks. SPECS: A Lightweight Runtime Mechanism for Protecting Software from Security-Critical Processor Bugs // Proceedings of the 20th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 2015. pp. 517–529.
    4. Adi Shamir. Bug Attacks // Proceedings of the 28th Annual Conference on Cryptology: Advances in Cryptology. 2008. pp. 221–240.
    5. John Favor. RISC86 Instruction Set // US Patent 6336178. 2002.
    6. Opteron Exposed: Reverse Engineering AMD K8 Microcode Updates. 2004.
    7. Daming Chen. Security Analysis of x86 Processor Microcode. 2014.
    8. Catalin Cimpanu. Malware Uses Obscure Intel CPU Feature to Steal Data and Avoid Firewalls. 2017.
    9. Daming Chen. Microparse: Microcode parser for AMD, Intel, and VIA processors // GitHub. 2014.
    10. Sandsifter: The x86 processor fuzzer // GitHub. 2017.
    11. Karev V. M. How to write an assembly program with overlapping instructions (another bytecode obfuscation technique) // Habrahabr. 2018. URL: (accessed: October 25, 2018).
