How STACKLEAK improves Linux kernel security
STACKLEAK is a Linux kernel security feature originally developed by the creators of grsecurity/PaX. I decided to bring STACKLEAK into the official vanilla kernel (the Linux kernel mainline). This article covers its internals, its security properties, and its long and difficult path into the mainline.
STACKLEAK protects against several classes of vulnerabilities in the Linux kernel, namely:
- reduces the useful information an attacker can obtain from kernel stack leaks to user space;
- blocks some attacks on uninitialized variables in the kernel stack;
- provides a means of dynamically detecting kernel stack overflow.
This security feature fits well with the idea behind the Kernel Self Protection Project (KSPP): security is more than just fixing bugs. It is impossible to fix every bug in the code, so the Linux kernel must fail safely in erroneous situations, including attempts to exploit vulnerabilities. More details on KSPP are available on the project wiki.
STACKLEAK is present as PAX_MEMORY_STACKLEAK in the grsecurity/PaX patch. However, the grsecurity/PaX patch has not been freely available since April 2017. Bringing STACKLEAK into the vanilla kernel would therefore be valuable for Linux users with heightened information security requirements.
Plan of work:
- extract STACKLEAK from the grsecurity/PaX patch,
- carefully study the code and create a patch,
- send it to LKML, get feedback, improve it, and repeat until it is accepted into the mainline.
At the time of writing (09/25/2018), the 15th version of the patch series has been sent. It contains the architecture-independent part and the code for x86_64 and x86_32. STACKLEAK support for arm64, developed by Laura Abbott from Red Hat, has already made it into the vanilla kernel 4.19.
STACKLEAK: security properties
Clearing residual information in the kernel stack
This measure reduces the useful information that some kernel stack leaks can expose to user space.
An example of information leakage from the kernel stack is shown in Scheme 1.
Scheme 1.
However, leaks of this type become useless if at the end of the system call the used part of the kernel stack is filled with a fixed value (Scheme 2).
Scheme 2.
As a result, STACKLEAK blocks some attacks on uninitialized variables in the kernel stack. Examples of such vulnerabilities are CVE-2017-17712 and CVE-2010-2963. A description of how to exploit CVE-2010-2963 can be found in an article by Kees Cook.
The essence of an attack on an uninitialized variable in the kernel stack is shown in Scheme 3.
Scheme 3.
STACKLEAK blocks attacks of this type because the value that fills the kernel stack at the end of a system call points to an unused region of the virtual address space (Scheme 4).
Scheme 4.
An important limitation is that STACKLEAK does not protect against similar attacks performed within a single system call.
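The effect described in this section can be sketched in user space. The following is a toy model, not kernel code: an array stands in for one thread's kernel stack, and all `model_*` names are hypothetical. Only the poison value, -0xBEEF, is taken from the article.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_STACK_WORDS 64
/* Poison value as given in the article for STACKLEAK_POISON. */
#define MODEL_POISON ((unsigned long)-0xBEEF)

/* Hypothetical model: this array plays the role of a thread's kernel stack. */
static unsigned long model_stack[MODEL_STACK_WORDS];

/* "Syscall" A: leaves a secret in a stack slot and returns without clearing it. */
void model_syscall_store_secret(size_t slot, unsigned long secret)
{
	assert(slot < MODEL_STACK_WORDS);
	model_stack[slot] = secret;
}

/* "Syscall" B: a buggy handler that hands back a slot it never initialized. */
unsigned long model_syscall_leak(size_t slot)
{
	assert(slot < MODEL_STACK_WORDS);
	return model_stack[slot];
}

/* End-of-syscall cleanup: overwrite the used slots with the poison value. */
void model_erase(size_t first_used, size_t count)
{
	assert(first_used + count <= MODEL_STACK_WORDS);
	for (size_t i = 0; i < count; i++)
		model_stack[first_used + i] = MODEL_POISON;
}
```

Without the call to `model_erase()` between the two "syscalls", the second one returns the secret left by the first. With it, the second one returns only the poison value. The model also reflects the limitation above: nothing is erased inside a single call.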
Detecting kernel stack depth overflow
In the vanilla kernel (the Linux kernel mainline), STACKLEAK is effective against stack depth overflow only in combination with CONFIG_THREAD_INFO_IN_TASK and CONFIG_VMAP_STACK. Both of these features were implemented by Andy Lutomirski.
The simplest way to exploit this type of vulnerability is shown in Scheme 5.
Scheme 5.
Overwriting certain fields of the thread_info structure at the bottom of the kernel stack allows an attacker to escalate the privileges of the process. However, when the CONFIG_THREAD_INFO_IN_TASK option is enabled, this structure is moved out of the kernel stack, which eliminates this way of exploiting the vulnerability.
A more advanced version of this attack overwrites data in a neighboring memory region by running past the end of the stack. Read more about this approach:
- in the presentation "The Stack is Back" by Jon Oberheide,
- in the article "Exploiting Recursion in the Linux Kernel" by Jann Horn.
An attack of this type is shown in Scheme 6.
Scheme 6.
In this case, CONFIG_VMAP_STACK serves as protection. When this option is enabled, a special memory page (a guard page) is placed next to the kernel stack, and any access to it causes an exception (Scheme 7).
Scheme 7.
Finally, the most interesting variant of stack overflow is the Stack Clash attack. The idea was put forward back in 2005 by Gael Delalleau.
In 2017, it was revisited by researchers from Qualys, who named the technique Stack Clash. The point is that there is a way to jump over the guard page and overwrite data in the neighboring memory region (Scheme 8). This is done using a variable-length array (VLA) whose size is controlled by the attacker.
Scheme 8.
More information about STACKLEAK and Stack Clash can be found on the grsecurity blog.
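The difference between hitting the guard page and jumping over it comes down to address arithmetic. The sketch below is a user-space model under assumed values: a 4096-byte guard page directly below the stack, a stack that grows towards lower addresses, and hypothetical `model_*` names that do not exist in the kernel.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE 4096UL

enum model_region {
	MODEL_IN_STACK,
	MODEL_IN_GUARD,
	MODEL_OUTSIDE /* neither stack nor guard page; below the guard page
	               * this means neighboring memory */
};

/* Hypothetical layout: [neighboring memory][guard page][stack].
 * The stack grows down from stack_end towards stack_begin. */
struct model_layout {
	unsigned long stack_begin; /* lowest valid stack address */
	unsigned long stack_end;   /* one past the highest stack address */
};

enum model_region model_classify(const struct model_layout *l,
				 unsigned long addr)
{
	assert(l->stack_begin >= MODEL_PAGE_SIZE);
	assert(l->stack_begin < l->stack_end);
	if (addr >= l->stack_begin && addr < l->stack_end)
		return MODEL_IN_STACK;
	if (addr >= l->stack_begin - MODEL_PAGE_SIZE && addr < l->stack_begin)
		return MODEL_IN_GUARD;
	return MODEL_OUTSIDE;
}

/* Where the stack pointer lands after reserving `size` bytes for a VLA. */
unsigned long model_sp_after_vla(unsigned long sp, unsigned long size)
{
	return sp - size;
}
```

With 256 bytes of stack left, a 512-byte VLA moves the stack pointer into the guard page, so the first access faults. A VLA larger than the remaining stack plus one page moves it past the guard page entirely, and a write there lands in neighboring memory without touching the guard page.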
How does STACKLEAK protect against Stack Clash on the kernel stack? Before each call to alloca(), a check for stack overflow is performed. Here is the corresponding code from version 14 of the patch series:
void __used stackleak_check_alloca(unsigned long size)
{
	unsigned long sp = (unsigned long)&sp;
	struct stack_info stack_info = {0};
	unsigned long visit_mask = 0;
	unsigned long stack_left;

	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));

	stack_left = sp - (unsigned long)stack_info.begin;
	if (size >= stack_left) {
		/*
		 * Kernel stack depth overflow is detected, let's report that.
		 * If CONFIG_VMAP_STACK is enabled, we can safely use BUG().
		 * If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt
		 * the neighbour memory. CONFIG_SCHED_STACK_END_CHECK calls
		 * panic() in a similar situation, so let's do the same if that
		 * option is on. Otherwise just use BUG() and hope for the best.
		 */
#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
		panic("alloca() over the kernel stack boundary\n");
#else
		BUG();
#endif
	}
}
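The core comparison in this function can be tried outside the kernel. The following is a minimal user-space sketch, assuming a simplified stand-in for the kernel's stack_info that keeps only the lower stack boundary. The names `model_stack_info` and `model_alloca_overflows` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's stack_info: only the lower
 * stack boundary is modeled here. */
struct model_stack_info {
	unsigned long begin;
};

/* Returns true when reserving `size` bytes at stack pointer `sp` would
 * reach or cross the lower stack boundary. This is the condition under
 * which the version 14 code calls panic() or BUG(). */
bool model_alloca_overflows(unsigned long sp,
			    const struct model_stack_info *info,
			    unsigned long size)
{
	assert(sp >= info->begin);
	unsigned long stack_left = sp - info->begin;

	return size >= stack_left;
}
```

Because the check compares the requested size with the space actually left, it also catches an allocation large enough to jump over a guard page, which a guard page alone cannot detect.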
However, this functionality was removed in version 15. This was done primarily because Linus Torvalds controversially banned the use of BUG_ON() in Linux kernel security patches.
In addition, the 9th version of the patch series led to a discussion that ended with the decision to eliminate all variable-length arrays from the mainline kernel. About 15 developers have joined this work, and it will be finished soon.
Effect of STACKLEAK on performance
Here are the results of performance testing on x86_64. Hardware: Intel Core i7-4770, 16 GB RAM.
Test 1, the favorable case: building the Linux kernel on a single CPU core
# time make
Result on 4.18:
real 12m14.124s
user 11m17.565s
sys 1m6.943s
Result on 4.18+stackleak:
real 12m20.335s (+0.85%)
user 11m23.283s
sys 1m8.221s
Test 2, the unfavorable case:
# hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.18: 9.08 s
Average result on 4.18+stackleak: 9.47 s (+4.3%)
Thus, the effect of STACKLEAK on system performance depends on the type of load. In particular, a large number of short system calls increases the overhead. It is therefore necessary to measure the performance impact of STACKLEAK under the planned load before deploying it in production.
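The percentages quoted above follow directly from the measured times. As a small sketch, the helper functions below (hypothetical names, not part of any benchmark tool) reproduce the arithmetic from the `real` times of the kernel build and from the hackbench averages.

```c
#include <assert.h>

/* Converts a `time`-style "XmY.ZZZs" reading to seconds. */
double to_seconds(unsigned minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Relative slowdown of `patched` over `baseline`, in percent. */
double overhead_percent(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (patched - baseline) / baseline * 100.0;
}
```

For the kernel build, `overhead_percent(to_seconds(12, 14.124), to_seconds(12, 20.335))` gives about 0.85. For hackbench, `overhead_percent(9.08, 9.47)` gives about 4.3.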
STACKLEAK internals
STACKLEAK consists of:
- code that clears the kernel stack at the end of a system call (originally written in assembly),
- a GCC plugin that instruments kernel code at compile time.
The kernel stack is cleared in the stackleak_erase() function. This function runs before returning to user space after a system call. It writes STACKLEAK_POISON (-0xBEEF) to the used part of the thread stack. The lowest_stack variable, which is continually updated in stackleak_track_stack(), marks the starting point of the cleanup.
The stages of stackleak_erase() are shown in Schemes 9 and 10.
Scheme 9.
Scheme 10.
Thus, stackleak_erase() clears only the used part of the kernel stack. That is why STACKLEAK is so fast. By comparison, clearing all 16 KB of the kernel stack at the end of every system call on x86_64 makes hackbench show a performance drop of 40%.
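The idea of erasing only the used part can be illustrated with a simplified user-space model. This sketch is not the kernel implementation and leaves out everything else the real function does. The structure, the index-based lowest_stack, and the assumption of 8-byte words are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_WORDS 2048 /* 16 KB, assuming 8-byte words */
#define MODEL_POISON ((unsigned long)-0xBEEF)

/* Hypothetical model of one thread's stack. Index 0 is the lowest
 * address, and the stack grows from the top index downwards. */
struct model_thread {
	unsigned long stack[MODEL_WORDS];
	size_t lowest_stack; /* deepest index used since the last erase */
};

/* Poison only [lowest_stack, top) and return the number of words written. */
size_t model_stackleak_erase(struct model_thread *t)
{
	assert(t->lowest_stack <= MODEL_WORDS);
	size_t written = 0;

	for (size_t i = t->lowest_stack; i < MODEL_WORDS; i++) {
		t->stack[i] = MODEL_POISON;
		written++;
	}
	/* Nothing has been used yet for the next system call. */
	t->lowest_stack = MODEL_WORDS;
	return written;
}
```

If a system call used only 100 words, the model writes 100 words instead of all 2048. The cost of the cleanup is proportional to the stack depth actually reached rather than to the size of the stack.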
Kernel code is instrumented at compile time by the STACKLEAK GCC plugin.
GCC plugins are project-specific loadable modules for the GCC compiler. They register new passes with the GCC Pass Manager and provide callbacks for these passes.
For STACKLEAK to work fully, calls to stackleak_track_stack() are inserted into functions with a large stack frame. In addition, before each alloca(), a call to the already mentioned stackleak_check_alloca() is inserted, followed by a call to stackleak_track_stack().
As already mentioned, in version 15 of the patch series the GCC plugin no longer inserts stackleak_check_alloca() calls.
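What the inserted tracking call does can be shown conceptually. This is a user-space sketch under stated assumptions: the stack pointer is passed in explicitly instead of being read from a register, the 1024-byte frame size is made up, and the `model_*` names are hypothetical. Only the name lowest_stack comes from the article.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread state holding the deepest stack pointer seen. */
struct model_task {
	unsigned long lowest_stack;
};

/* Record the stack pointer if it is deeper (lower) than anything seen
 * so far. The stack is assumed to grow towards lower addresses. */
void model_track_stack(struct model_task *task, unsigned long sp)
{
	assert(task != NULL);
	if (sp < task->lowest_stack)
		task->lowest_stack = sp;
}

/* Conceptual shape of an instrumented function with a large frame:
 * the inserted tracking call runs before the original body. */
void model_big_frame_function(struct model_task *task,
			      unsigned long sp_on_entry)
{
	unsigned long frame_size = 1024; /* hypothetical frame size */

	model_track_stack(task, sp_on_entry - frame_size);
	/* ... the original function body would follow here ... */
}
```

Tracking only functions with large frames keeps the number of inserted calls small while still giving a usable bound on how deep the stack went during a system call.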
The path into the Linux kernel mainline
STACKLEAK's path into the mainline has been very long and difficult (Scheme 11).
Scheme 11. Progress of bringing STACKLEAK into the Linux kernel mainline.
In April 2017, the grsecurity creators closed their patches to the community and began distributing them only on a commercial basis. In May 2017, I decided to take on the task of bringing STACKLEAK into the vanilla kernel. That was the start of a journey lasting more than a year. Positive Technologies, the company I work for, lets me spend part of my working time on this task. But mostly I spend my "free" time on it.
Since last May, my patch series has gone through multiple reviews, has changed significantly, and has been criticized twice by Linus Torvalds. I wanted to give up on the whole undertaking many times. But at some point I felt a strong desire to see it through to the end. At the time of writing (09/25/2018), the 15th version of the patch series is in the linux-next branch, meets all the requirements Linus has voiced, and is ready for the 4.20 / 5.0 kernel merge window.
A month ago, I gave a talk on this work at the Linux Security Summit. Links to the slides and video:
Conclusion
STACKLEAK is a very useful Linux kernel security feature that blocks the exploitation of several types of vulnerabilities at once. In addition, its original author, the PaX Team, made it fast and elegantly engineered. Bringing STACKLEAK into the vanilla kernel would therefore be valuable for Linux users with heightened information security requirements. Moreover, work in this direction draws the attention of the Linux development community to kernel self-protection mechanisms.
P.S.
STACKLEAK was eventually merged into the Linux 4.20 kernel:
git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2d6bb6adb714b133db92ccd4bfc9c20f75f71f3f
The x86_64, x86_32 and arm64 architectures are supported.
In addition, the work to eliminate variable-length arrays from the Linux kernel code has been completed. In kernel version 4.20, the gcc compiler warning "-Wvla" is enabled: lkml.org/lkml/2018/10/28/189