Operation Heap Overflow Using JavaScript

Original author: Mark Daniel, Jake Honoroff, Charlie Miller
  • Transfer

From translator


In this study, the authors reveal an interesting technique for exploiting heap memory overflows. Of course, this vulnerability has been fixed long ago, but the technique itself is very interesting, and the overflow process is described in some detail.
If you are interested in information security and would like to understand how overflows occur, which now and then appear in the news bulletins, you will like the study.

Foreword


This article introduces a new method for exploiting heap
overflows in JavaScript interpreters. In short,
you can use JavaScript commands to get heap overflow to ensure that the function pointer is displayed reliably immediately after a buffer overflow. This tutorial uses the Safari technique that the authors used to win the CanSecWest 2008 Pwn2Own contest.

Introduction


Many buffer and integer overflow vulnerabilities allow you to write several arbitrary values ​​at a relative offset to the heap pointer. Unfortunately for the attacker, often the data following the pointer is unpredictable, which makes operation difficult and unreliable. With the perfect execution of heap overflows, the attacker gains full control over the number and values ​​of overflowed bytes, and this is often practically impossible if nothing interesting and predictable awaits rewriting.

Thanks to the secure release of the metadata structure, heaps are increasingly not an interesting target for overflow. Currently, overflow data needs such data for which normal execution of the program leads to a function pointer that has been overwritten with a pointer to the code of the attacker. However, such operations do not in any way guarantee reliability. For a successful overflow, pointers that are not yet available should be on the heap after the buffer overflow, and there should not be other critical data or unmarked memory between them, the destruction of which will lead to a premature failure. Such ideal conditions are certainly rare for any application.

However, given client-side scripting language access, such as JavaScript, an attacker could create these ideal conditions for vulnerabilities in applications such as web browsers. In the article (Sotirov, A. Heap Feng Shui in JavaScript. Blackhat Europe 2007), Sotirov describes how to use JavaScript distributions in Internet Explorer to allow an attacker to control the target heap. In this article, we describe a new technique inspired by his bunch of feng shui (Heap Feng Shui), which can be used for reliable positioning.

2 Technique


2.1 Context


In a broad sense, this method can be used to develop client exploits against web browsers that use JavaScript. In this context, an attacker creates a web page containing (among other things) JavaScript commands, and causes the victim to load the page in a browser. Using certain JavaScript commands, an attacker affects the heap state in the victim’s browser process in order to organize a successful attack.

In the future, we will keep in mind that a buffer that is full is a vulnerable buffer. To achieve heap overflow, the placement and overflow of the vulnerable buffer must be run in the JavaScript interpreter. In particular, this method will not be used in situations where a vulnerable buffer has already been allocated before the JavaScript interpreter was created.

We also assume that shellcode is available, and a mechanism for loading it into memory has already been found. This is trivial with JavaScript - just load it into a large line.

2.2 Overview


Keep in mind that the goal is to manage the buffer on the heap immediately after the vulnerable buffer. We achieve this by arranging a heap in which all the holes are large enough to hold the vulnerable buffer, and are surrounded by the buffers that we manage.

The technique consists of five stages.

1. Defragment the heap.
2. Create a hole in the heap.
3. Preparing blocks around the holes.
4. Isolation and overflow of the trigger.
5. Starting the transition to shellcode.
These steps are described in more detail in the rest of this section.

2.3 Defragmentation


The state of the process heap depends on the history of memory allocation and release that occurred during the life of the process. Therefore, the heap state for a continuous multi-threaded process (such as a web browser) is unpredictable. Typically, such a heap is fragmented by the fact that it has many openings in free memory. The presence of holes of different sizes of free memory means that the addresses of consecutive distributions of buffers of the same size are unlikely to have a reliable relationship. Figure 1 shows what the distribution might look like in a fragmented heap:


Figure 1. Fragmented heap.

Since it is impossible to predict where these holes can occur in a heap that has been used for some time, it is impossible to predict where the next distribution will occur.

However, if we have some control over the target application, we can force the application to make many distributions of any given size. In particular, we can make many distributions that essentially fill all the holes. As soon as the holes are filled, that is, the heap is defragmented, distributions of a similar size will usually be predictably close to each other, as in Figure 2:


Figure 2. Defragmented heap. Future appropriations turn out to be contiguous.

We emphasize that defragmentation always refers to a specific buffer size. In preparation for exploiting the vulnerability, we must defragment the heap with respect to the size of the vulnerable buffer. The size of this buffer will vary, but it must be known. As shown in Figure 2, setting up defragmentation is very easy in JavaScript.

For instance:

var bigdummy = new Array(1000); 
for(i=0; i<1000; i++){
      bigdummy[i] = new Array(size); 
}

In the code snippet above, each call to new Array (size) results in a 4 * size + 8 byte allocation in the heap. This distribution corresponds to an ArrayStorage object consisting of an eight-byte header followed by an array of size pointers. Initially, all distributions are reset to zero. Size should be chosen so that the resulting distributions are as close to size as possible and no less than the size of the vulnerable buffer. The value 1000 indicated above was determined empirically.

2.4 make holes


Remember that our goal is to gain control over the buffer that follows the vulnerable buffer. Assuming the defragmentation worked, we should have several adjacent buffers at the end of the heap, about the same size as the vulnerable buffer (not yet allocated) (Figure 3):


Figure 3. Defragmented heap with many distributions. We see a long line of buffers of the same size, which we control.

The next step is to free all the rest of these adjacent buffers, leaving alternate buffers and holes that match the size of the vulnerable buffer.

The first step in achieving this is the following code:

for (i=900; i<1000; i+=2){ 
     delete(bigdummy[i]);
 }

The lower limit in the for loop is based on the assumption that after 900 distributions in the defragmentation phase, we have reached the point where all subsequent distributions occur contiguously at the end of the heap.

Unfortunately, at this stage we have a problem. Simply deleting an object in JavaScript does not immediately cause the object space on the heap to be freed. The space in the heap is not cleared until garbage collection occurs. Internet Explorer provides the CollectGarbage () method, which immediately starts garbage collection, but other browser implementations do not. In particular, WebKit does not. Therefore, we digress into the discussion of garbage collection in WebKit.

Our review of the WebKit source code showed that there are three main events that can trigger garbage collection. To understand these events, you need to be familiar with how JavaScript code in WebKit manipulates objects.

The implementation supports two structures: primaryHeap and numberHeap, each of which is an array of pointers to CollectorBlock objects. CollectorBlock is an array of cells of fixed sizes, and each cell can contain an JSCell object (or derivative). Each JavaScript object occupies a cell in one of these heaps. Large objects (such as arrays and strings) take up additional memory in the system heap. We refer to this additional system memory as related storage.

Each CollectorBlock maintains a linked list of free cells. When selection is requested and there are no free cells in any existing CollectorBlocks, a new CollectorBlock is allocated.

All JavaScript objects obtained from JSObject are allocated in primaryHeap. numberHeap is reserved for NumberImp objects. Please note that the latter do not correspond to JavaScript Number objects, but, as a rule, they are short-circuit objects corresponding to intermediate arithmetic calculations.

When garbage collection starts, both heaps are checked for objects with zero references, and such objects (and the associated storage, if any) are freed.

Now we return to the three events found that trigger garbage collection:

1. Expiration of the dedicated garbage collection timer.
2. A distribution request that occurs when all of the individual CollectorBlocks of the heap are full.
3. Selecting an object with a sufficiently large associated repository (or several such objects).

The first of these events is not very useful, because JavaScript does not have a standby mode, and any noticeable delay in the script may cause the Slow Script dialog box to appear.

The last of these events depends on the number of objects in primaryHeap and the size of their associated storage. Experiments show that the state of primaryHeap varies greatly depending on the number of web pages open in the browser and the extent to which these pages use JavaScript. Therefore, reliably starting garbage collection with this event is difficult.

On the other hand, numberHeap seems to be relatively insensitive to these changes. Studies show that numberHeap only supports one dedicated CollectorBlock, even with significant pageviews with JavaScript. Since numberHeap CollectorBlock has 4062 Cells, the JavaScript code that creates a lot of NumberImps (performs a lot of arithmetic operations with intermediate calculations) should initiate garbage collection. As an example, manipulating doubles is the operation that results in the assignment of NumberImp from numberHeap, so the following JavaScript code can be used to start garbage collection:

for (i=0; i<4100; i++) {
     a = .5;
 }

After completing this code, the heap should look like Figure 4, and we are ready for the next step:


Figure 4. Managed heap with all the buffers freed. Allocation of the vulnerable buffer ends in one of the holes.

2.5 Preparation of blocks.


This step is simple. We use the following JavaScript:

for (i=901; i<1000; i+=2) { 
    bigdummy[i][0] = new Number(i);
 }

The code bigdummy [i] [0] = new Number (i) creates a new NumberInstance object and stores a pointer to this object in the ArrayStorage object corresponding to bigdummy [i]. Figure 5 shows part of the heap after running JavaScript: Figure 5. Details of the block that the attacker controls, just before the overflow starts


.

2.6. Start redistribution and overflow.


Now it's time to allocate a vulnerable buffer. If the previous steps have passed, as expected, the distribution for the vulnerable buffer will be in one of the holes that we created, and we are ready for overflow. The overflow object is to overwrite the pNI pointer in the ArrayStorage object, which follows the vulnerable buffer. The new value should be the sled address for the shellcode. Details about sled will be discussed below, but for now, note that typical sled NOPs (https://en.wikipedia.org/wiki/NOP_slide) are not suitable here. After allocation and overflow, the heap should look like shown in Figure 6:


Figure 6. Information about the block controlled by the attacker immediately after the overflow starts.

2.7 Starting the transition to shellcode


The transition to shellcode is performed by simple interaction with Number objects created in the preparation of the blocks above. More specifically, we need to force a virtual method of the underlying NumberInstance object in the JavaScript implementation. For blocks that have not been overwritten, execution is transferred to * ((* pNI) + 4 * k), where k is the index of the method in the virtual function table that is called. For the block that immediately follows the vulnerable buffer, execution is transferred to * ((* pSled) + 4 * k). This pSled double dereferencing is a little annoying, but the case study below shows an easy way to deal with it.

The following JavaScript makes a virtual function call for each NumberInstance object and thereby starts shellcode execution.

for (i=901; i<1000; i+=2) {
     document.write(bigdummy[i][0] + "
"); }

3. Case study


Our research in the area of ​​reliable work with a bunch of JavaScript was motivated by a vulnerability that we found in the PCRE (Perl-Compatible Regular Expression) WebKit analysis. It was an integer overflow that allowed an arbitrary overflow size behind the buffer containing the compiled regular expression, which could be any size up to 65535 bytes. However, the overflow occurred very soon after the allocation of the buffer, so we often encountered unallocated memory during the overflow. In other cases, we rewrote important data, but it seemed completely unpredictable what data there were changed between runs. The equipment described in section 2 solved these problems for us and allowed us to ensure reliable operation.

First we needed to defragment the heap for a target size of about 4000 bytes. The following debugger output shows how to use the first few distributions to defragment the heap (note the jump around the distribution addresses): By the time we have done almost 1000 distributions, the heap starts to look quite predictable and all distributions are ultimately contiguous Freeing the heap holes at the end, using the garbage collection technique described in Section 2.4, we see that a vulnerable buffer drops out at the last hole at 0x17168000, right in front of the data that we control at 0x17169000.

Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$1 = 0x16278c78
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$2 = 0x50d000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$3 = 0x510000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$4 = 0x16155000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$5 = 0x1647b000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$6 = 0x1650f000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$7 = 0x5ac000




Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$997 = 0x17164000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$998 = 0x17165000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$999 = 0x17166000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$1000 = 0x17167000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$1001 = 0x17168000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance () array buffer at$1002 = 0x17169000




Breakpoint 2, 0x95846748 in jsRegExpCompile () regex buffer at$1004 = 0x17168000


Therefore, we will overflow our regex buffer into ArrayStorage data. Overflow bytes must be compiled into regular expression bytes. Fortunately, the construction of the regex character class allows for almost arbitrary bytes in compiled form, since it compiles into 33 bytes, the last 32 of which are an array with 256-bit bits, where the bit set to one means that this character is in the class . Thus, we use the character class [\ x00 \ x59 \ x5c \ x5e] and arrange them at the beginning of the ArrayStorage data, since the first 3 words of its compiled bit class are nonzero dword, zero dword, and the address in our heap, namely:

0x00000001 0x00000000 0x52000000

Finally, we use the specially crafted address on the heap, using the large dword string 0x52780278, followed by the shellcode. We will arrange the distribution of spraying so that this address is within the scope of what we need. Then, when it is interpreted as an instruction after the start of execution, it becomes conditional jumps:

78 02: js +0x2
78 52: js +0x52

which are effective NOP regardless of the value of the transition condition: if the condition is true, a jump of two is taken to the beginning of the next command, and if the condition is false, this is also not true for the 0x52 jump. This means the most significant byte, when determining the heap spray address is not fully used in sled, and is also aligned by 4 bytes.

4. Conclusions


The technique described in this article made it possible to ensure reliable operation of buffer overflows, which initially did not have predictable and interesting data for rewriting. While an attacker needs some control over the system, such as distribution size, overflow size and overflow data, this method should be applied to other browser vulnerabilities when the attacker has access to JavaScript. We suspect that such methods may be applicable when working with other client-side scripting languages.

Also popular now: