Where do security gaps come from?
"We have a security hole." "Well, at least something around here is secure." (a joke)
If you are a Windows user, you should be familiar with pop-up windows every second Tuesday of the month reporting the installation of "critical security updates." Microsoft is making considerable efforts to constantly fix vulnerabilities in its operating systems, but it's worth it: in a world where the number of cyber attacks is increasing day by day, not a single loophole in the defense of our computers should remain open to potential attackers.
A recent review discussion of SVN revision 66192 on the ReactOS mailing list showed how easy it is to introduce a critical vulnerability into kernel code. I will use this case as an example of a simple but security-relevant bug, and to illustrate some of the measures kernel code must be subjected to if you really want a secure system.
Finding the vulnerability
Let's take a look at this code and see what's what:
NTSTATUS
APIENTRY
NtUserSetInformationThread(IN HANDLE ThreadHandle,
                           IN USERTHREADINFOCLASS ThreadInformationClass,
                           IN PVOID ThreadInformation,
                           IN ULONG ThreadInformationLength)
{
    [...]
    switch (ThreadInformationClass)
    {
        case UserThreadInitiateShutdown:
        {
            ERR("Shutdown initiated\n");
            if (ThreadInformationLength != sizeof(ULONG))
            {
                Status = STATUS_INFO_LENGTH_MISMATCH;
                break;
            }
            Status = UserInitiateShutdown(Thread, (PULONG)ThreadInformation);
            break;
        }
        [...]
    }
    [...]
}
This is a small piece of NtUserSetInformationThread, a system call in win32k.sys that user programs can call (more or less) directly. Here ThreadInformation is a pointer to a block of data, and the ThreadInformationClass parameter indicates how that data should be interpreted. If it is UserThreadInitiateShutdown, the block must contain a 4-byte integer. The number of bytes passed is given in ThreadInformationLength, and, as you can easily see, the code does check that it is exactly 4; otherwise execution is aborted with the STATUS_INFO_LENGTH_MISMATCH error. But note that both of these parameters come straight from the user program, which means that any malicious code calling this function can pass it anything at all.
Now let's see what happens with ThreadInformation when it is passed to UserInitiateShutdown:
NTSTATUS
UserInitiateShutdown(IN PETHREAD Thread,
                     IN OUT PULONG pFlags)
{
    NTSTATUS Status;
    ULONG Flags = *pFlags;
    [...]
    *pFlags = Flags;
    [...]
    /* If the caller is not Winlogon, do some security checks */
    if (PsGetThreadProcessId(Thread) != gpidLogon)
    {
        // FIXME: Play again with flags...
        *pFlags = Flags;
        [...]
    }
    [...]
    *pFlags = Flags;
    return STATUS_SUCCESS;
}
Since quite a large part of this function has not been implemented yet, all that happens above is a few reads and writes of the 4-byte value that the user-supplied pointer refers to.
So what's the problem then?
Well, merely dereferencing an unvalidated pointer is enough to make a denial-of-service (DoS) attack possible: a malicious program can bring down the computer without having the right to do so. For example, a program can simply pass a NULL pointer. UserInitiateShutdown will dereference it, which leads to a BSOD, or "bug check" as kernel developers usually call it.
On top of that, the caller gains the ability to write to memory (remember that this is an arbitrary pointer, so it can even refer to kernel space!). At first glance, writing a value back to the same location it was just read from does not look so bad. In reality it can cause plenty of trouble. Some regions of memory change very frequently, and restoring a value that was stored there a moment earlier may, for example, reduce the entropy of a random number generator used by some cryptographic algorithm, or overwrite a page table with an old version that should already have been discarded, giving access to more memory that can then be used to compromise the system.
These are just examples off the top of my head. A determined attacker may spend months working out the best way to reach their goal, and a seemingly small security flaw like this one may be enough for someone to extract every secret from your machine and take complete control of it. And of course, once the function is fully implemented, it will modify the Flags variable before writing it back, providing the ability to modify an arbitrary piece of (kernel) memory in a controlled way: a real feast for a hacker.
Knowing all this, how can it be fixed?
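To make the "write back what was just read" hazard concrete, here is a small user-mode simulation in C. It is only a sketch under stated assumptions: simulated_kernel_word, concurrent_update, read_then_write_back and update_was_lost are hypothetical names that do not exist in ReactOS, and the "concurrent" change is modeled by a callback invoked between the read and the write rather than by a real second thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a word of memory that other code updates
 * frequently (for example, a counter or a piece of RNG state). */
uint32_t simulated_kernel_word = 100;

/* Models another thread changing the word between our read and write. */
void concurrent_update(void)
{
    simulated_kernel_word += 1;
}

/* Mirrors the shape of the unfixed function: read the value through the
 * caller-supplied pointer, then store it back unchanged later on. */
void read_then_write_back(uint32_t *p_flags, void (*in_between)(void))
{
    uint32_t flags;

    assert(p_flags != NULL);     /* the simulation itself must not crash */
    flags = *p_flags;            /* read */
    if (in_between != NULL)
        in_between();            /* memory changes behind our back */
    *p_flags = flags;            /* write back the now stale value */
}

/* Returns 1 if the concurrent update was lost, 0 otherwise. */
int update_was_lost(void)
{
    simulated_kernel_word = 100;
    read_then_write_back(&simulated_kernel_word, concurrent_update);
    return simulated_kernel_word == 100;
}
```

Calling update_was_lost() in this simulation reports that the increment made "in between" has been silently undone, which is the effect described above for frequently changing memory.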
To protect against this kind of problem, the NT kernel has two mechanisms: probing and SEH (Structured Exception Handling). Probing eliminates a large class of problems by making sure that a pointer received from an application really refers to user address space. Performing this check on every pointer parameter ensures that user-mode programs cannot reach kernel memory this way. However, it does not protect against NULL or other invalid pointers. This is where the second mechanism, SEH, comes to the rescue: wrapping every access through an untrusted pointer (that is, one received from a user program) in an exception-handling block ensures that the code stays stable even if the pointer is invalid. The kernel-mode code provides an exception handler that is invoked whenever the protected code raises an exception (such as an access violation caused by an invalid pointer). The handler collects the available information (such as the exception code), performs any necessary cleanup, and in most cases returns control to the user together with an error code.
Let's look at the corrected source code (commit r66223):
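The address-range part of probing can be sketched in portable C. This is not the ReactOS or Windows implementation of ProbeForWrite: the boundary constant SIM_USER_PROBE_ADDRESS, the SIM_STATUS_* codes and the function sim_probe_range are assumptions made for this illustration, and the function returns a status instead of raising an exception. Addresses are treated as plain integers so that nothing is ever dereferenced.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for the sketch only; a real kernel defines its own. */
#define SIM_USER_PROBE_ADDRESS            UINT64_C(0x7FFF0000)
#define SIM_STATUS_SUCCESS                0
#define SIM_STATUS_ACCESS_VIOLATION       1
#define SIM_STATUS_DATATYPE_MISALIGNMENT  2

/* Checks that [address, address + length) is aligned and lies entirely
 * below the simulated user/kernel boundary. */
int sim_probe_range(uint64_t address, uint64_t length, uint64_t alignment)
{
    uint64_t end;

    /* Precondition: alignment is a power of two (1, 2, 4, 8, ...). */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    if (length == 0)
        return SIM_STATUS_SUCCESS;          /* nothing will be accessed */

    if ((address & (alignment - 1)) != 0)
        return SIM_STATUS_DATATYPE_MISALIGNMENT;

    end = address + length;
    if (end < address)                      /* the range wraps around */
        return SIM_STATUS_ACCESS_VIOLATION;
    if (end > SIM_USER_PROBE_ADDRESS)       /* reaches past the boundary */
        return SIM_STATUS_ACCESS_VIOLATION;

    return SIM_STATUS_SUCCESS;
}
```

In this sketch, sim_probe_range(0x80000000, 4, 4) is rejected because the range lies above the assumed boundary, while sim_probe_range(0, 4, 4) is accepted: a pure range check says nothing about whether the memory is actually mapped, which is why the accesses still need an exception handler around them.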
ULONG CapturedFlags = 0;

ERR("Shutdown initiated\n");
if (ThreadInformationLength != sizeof(ULONG))
{
    Status = STATUS_INFO_LENGTH_MISMATCH;
    break;
}

/* Capture the caller value */
Status = STATUS_SUCCESS;
_SEH2_TRY
{
    ProbeForWrite(ThreadInformation, sizeof(CapturedFlags), sizeof(PVOID));
    CapturedFlags = *(PULONG)ThreadInformation;
}
_SEH2_EXCEPT(EXCEPTION_EXECUTE_HANDLER)
{
    Status = _SEH2_GetExceptionCode();
}
_SEH2_END;

if (NT_SUCCESS(Status))
    Status = UserInitiateShutdown(Thread, &CapturedFlags);

/* Return the modified value to the caller */
_SEH2_TRY
{
    *(PULONG)ThreadInformation = CapturedFlags;
}
_SEH2_EXCEPT(EXCEPTION_EXECUTE_HANDLER)
{
    Status = _SEH2_GetExceptionCode();
}
_SEH2_END;
Note that all accesses through the unsafe ThreadInformation pointer now happen inside _SEH2_TRY blocks. Any exception raised in them is handled by the code in the _SEH2_EXCEPT block. In addition, before the pointer is dereferenced for the first time, ProbeForWrite is called; it raises a STATUS_ACCESS_VIOLATION or STATUS_DATATYPE_MISALIGNMENT exception if it detects an invalid pointer (misaligned or belonging to kernel space, for example) or write-protected memory. Finally, note the new variable CapturedFlags, which is what gets passed to UserInitiateShutdown. This trick simplifies working with the unsafe parameter: instead of using SEH on every access to pFlags inside the function, NtUserSetInformationThread stores the value in a trusted location and writes it back to user memory once UserInitiateShutdown has finished. UserInitiateShutdown itself does not need to be changed, since it now receives a safe pointer into kernel space (a pointer to CapturedFlags). As a result of all these measures, the function can now handle absolutely any user-supplied data, valid or not, without risk of harming the system. Done!
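The capture pattern itself does not depend on SEH and can be sketched in user-mode C. Everything in this sketch is hypothetical: sim_copy_in and sim_copy_out stand in for the probed, exception-guarded accesses (they only reject NULL, which is far weaker than a real probe), sim_worker stands in for UserInitiateShutdown, and the SIM_OK and SIM_FAULT status values are invented for the example. Unlike the commit above, the sketch returns early on failure instead of always attempting the write-back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_OK     0
#define SIM_FAULT  1   /* invented status for a rejected request */

/* Stand-ins for guarded user-memory accesses. They only reject NULL. */
int sim_copy_in(uint32_t *dst, const uint32_t *untrusted_src)
{
    if (untrusted_src == NULL)
        return SIM_FAULT;
    *dst = *untrusted_src;
    return SIM_OK;
}

int sim_copy_out(uint32_t *untrusted_dst, uint32_t value)
{
    if (untrusted_dst == NULL)
        return SIM_FAULT;
    *untrusted_dst = value;
    return SIM_OK;
}

/* Stand-in for the internal worker. It only ever receives a trusted
 * pointer, so it needs no guards of its own. Setting bit 0 is arbitrary. */
int sim_worker(uint32_t *trusted_flags)
{
    assert(trusted_flags != NULL);
    *trusted_flags |= 1u;
    return SIM_OK;
}

/* Entry point: capture, work on the local copy, write the result back. */
int sim_set_information(uint32_t *untrusted, size_t length)
{
    uint32_t captured = 0;
    int status;

    if (length != sizeof(uint32_t))
        return SIM_FAULT;

    status = sim_copy_in(&captured, untrusted);
    if (status != SIM_OK)
        return status;              /* bad pointer: fail, do not crash */

    status = sim_worker(&captured);
    if (status != SIM_OK)
        return status;

    return sim_copy_out(untrusted, captured);
}
```

The design point is that only the two copy helpers ever touch the untrusted pointer, so any hardening (probing, exception handling) has to be written in exactly two places, and the worker can be arbitrarily complex without repeating it.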
What lesson should be learned from this?
Obviously, increased vigilance even at the development stage lets you spot lines of code that could become a security risk later on. Don't get your hopes up too much, though, because, frankly, even apart from cases like this one there are certainly plenty of security problems left. In the future, if everything goes according to plan, we will gradually find and fix them, releasing regular updates much like the ones you receive on Tuesdays from Windows Update.
A side note. As Alex Ionescu rightly pointed out, Windows itself has a vulnerability in the same function, NtUserSetInformationThread. Moreover, according to him, it is still unpatched and is actively exploited for all kinds of jailbreaks of devices such as the Surface RT. It was first described back in 2012 by the well-known security researcher Mateusz "j00ru" Jurczyk (who, incidentally, often hangs out with us on IRC ;]). You can find his article on the subject on his blog: j00ru.vexillium.org/?p=1393
- Notes from the translator:
Participated in the translation: Postscripter, al-tarakanoff, Alexey Bragin, Mabou