Exceptions for the hardcore. Exception handling in dynamically allocated code


    Modern versions of Windows impose security restrictions on executable code. Under these conditions, using the exception mechanism in injected code or, say, in a manually mapped image can be a nontrivial task if you are unaware of certain nuances. This article focuses on the internals of the user-mode exception dispatcher in Windows on the x86, x64, and IA64 platforms, as well as on ways to work around the system restrictions.

    __try


    Suppose you face a task that requires full-fledged exception handling in code injected into a foreign process, or you are writing yet another PE packer or crypter that must keep exceptions working in the unpacked image. Either way, it all boils down to the fact that code using exceptions runs outside an image mapped by the system loader, and this is the main source of difficulties. To demonstrate the problem, consider a simple example that copies its own image to a new region within the address space of the current process:

    void exceptions_test()
    {
        __try {
            int *i = 0;
            *i = 0;
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            /* We may never get here */
            MessageBoxA(0, "Exception caught", "", 0);
        }
    }
    int main()
    {
        /* Check that exceptions work */
        exceptions_test();
        /* Copy the entire current image to a new region */
        PVOID ImageBase = GetModuleHandle(NULL);
        DWORD SizeOfImage = RtlImageNtHeader(ImageBase)->OptionalHeader.SizeOfImage;
        PVOID NewImage = VirtualAlloc(NULL, SizeOfImage, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
        memcpy(NewImage, ImageBase, SizeOfImage);
        /* Fix up relocations (RelocateImage is a helper not shown here) */
        ULONG_PTR Delta = (ULONG_PTR) NewImage - (ULONG_PTR) ImageBase;
        RelocateImage(NewImage, Delta);
        /* Call exceptions_test in the copy of the image */
        void (*new_exceptions_test)() = (void (*)()) ((ULONG_PTR) &exceptions_test + Delta);
        new_exceptions_test();
        return 0;
    }
    

    In the exceptions_test procedure, an attempt to write through a null pointer is wrapped in the MSVC try-except extension; in place of an exception filter there is a stub that returns EXCEPTION_EXECUTE_HANDLER, which should immediately lead to execution of the code in the except block. On the first call exceptions_test works as expected: the exception is caught and the message box is displayed. But after the code is copied to a new location and the copy of exceptions_test is called, the exception is no longer handled, and the application simply crashes with the unhandled exception message typical of the particular OS version. The specific reason for this behavior depends on the platform on which the test is run, and to determine it we need to understand the exception dispatch mechanism.
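
    The listing calls a RelocateImage helper whose body is not given in the article. As a rough illustration of what such a helper has to do, here is a minimal sketch that walks a base relocation directory in the standard PE layout and adds the load delta to every fixup. The function name, its parameters, and the choice to pass the relocation directory as a separate buffer are assumptions made for this example; a real helper would locate the directory through the PE headers of the copied image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Relocation types defined by the PE format (IMAGE_REL_BASED_*). */
enum { REL_ABSOLUTE = 0, REL_HIGHLOW = 3, REL_DIR64 = 10 };

/*
 * Hypothetical helper (not the article's RelocateImage): applies `delta`
 * to every fixup described by the relocation directory `reloc`.
 * Returns 0 on success and -1 on malformed or unsupported input.
 */
static int apply_base_relocations(uint8_t *image, size_t image_size,
                                  const uint8_t *reloc, size_t reloc_size,
                                  int64_t delta)
{
    assert(image != NULL && reloc != NULL);
    size_t pos = 0;
    while (pos + 8 <= reloc_size) {
        /* Block header: page RVA followed by the total block size. */
        uint32_t page_rva, block_size;
        memcpy(&page_rva, reloc + pos, 4);
        memcpy(&block_size, reloc + pos + 4, 4);
        if (block_size < 8 || block_size > reloc_size - pos)
            return -1;
        size_t count = (block_size - 8) / 2;
        for (size_t i = 0; i < count; i++) {
            uint16_t entry;
            memcpy(&entry, reloc + pos + 8 + i * 2, 2);
            unsigned type = entry >> 12;            /* high 4 bits */
            size_t target = (size_t)page_rva + (entry & 0x0FFFu);
            if (type == REL_ABSOLUTE)
                continue;                           /* padding entry */
            if (type == REL_HIGHLOW) {
                uint32_t value;
                if (target + 4 > image_size) return -1;
                memcpy(&value, image + target, 4);
                value += (uint32_t)delta;
                memcpy(image + target, &value, 4);
            } else if (type == REL_DIR64) {
                uint64_t value;
                if (target + 8 > image_size) return -1;
                memcpy(&value, image + target, 8);
                value += (uint64_t)delta;
                memcpy(image + target, &value, 8);
            } else {
                return -1;                          /* type not handled here */
            }
        }
        pos += block_size;
    }
    return 0;
}
```

    Only the HIGHLOW (x86) and DIR64 (x64) fixup types are handled, which is what ordinary MSVC-built images use; anything else is reported as an error rather than silently skipped.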


    Unhandled exception

    Exception dispatch


    Regardless of the platform and the type of exception, user-mode dispatch always starts at the KiUserExceptionDispatcher entry point in ntdll, to which the kernel's KiDispatchException transfers control (if the exception was raised in user mode and was not handled by a debugger). In the previous example, control reaches the dispatcher in both cases (during execution of exceptions_test and of its copy at the new address); you can verify this by setting a breakpoint on ntdll!KiUserExceptionDispatcher. The KiUserExceptionDispatcher code is very simple and looks roughly like this:

    VOID NTAPI KiUserExceptionDispatcher (EXCEPTION_RECORD *ExceptionRecord, CONTEXT *Context)
    {
        NTSTATUS Status;
        if (RtlDispatchException(ExceptionRecord, Context)) {
            /* The exception was handled; execution can continue */
            Status = NtContinue(Context, FALSE);
        }
        else {
            /* Re-raise the exception, this time without trying to find a handler */
            Status = NtRaiseException(ExceptionRecord, Context, FALSE);
        }
        ...
        RtlRaiseException(&NestedException);
    }
    

    where EXCEPTION_RECORD is a structure with information about the exception, and CONTEXT is a structure holding the state of the thread context at the moment the exception occurred. Both structures are documented on MSDN, and you are probably already familiar with them. Pointers to this data are passed to ntdll!RtlDispatchException, where the actual dispatch takes place; the mechanics of exception handling differ between 32-bit and 64-bit systems.

    x86


    The main mechanism on the x86 platform is Structured Exception Handling (SEH), based on a singly linked list of exception handlers that resides on the stack and is always accessible through NT_TIB.ExceptionList. The basics of this mechanism have been described many times (see the “Useful materials” box), so we will not repeat them and will focus only on the points relevant to our task.


    Dump of the SEH chain

    The point is that in SEH all elements of the handler list must reside on the stack, which means they can be overwritten when a stack buffer overflows. Exploit authors used this successfully: the pointer to the handler was overwritten with the address needed to run the shellcode, and the pointer to the next list element was overwritten as well, which broke the integrity of the handler chain. To make programs that use SEH more resistant to such attacks, Microsoft developed mechanisms such as SafeSEH (a table with the addresses of "safe" handlers, located in the IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG directory of the PE file) and SEHOP (a simple integrity check of the frame chain), and also integrated checks that follow the system DEP policy into the exception dispatcher.

    A simplified pseudo-code of the main dispatch procedure RtlDispatchException for the x86 version of the ntdll.dll library in Windows 8.1 can be represented (with some assumptions) as follows:

    void RtlDispatchException(...) // NT 6.3.9600
    {
        /* Call the chain of Vectored Exception Handlers */
        if (RtlpCallVectoredHandlers(exception, 1)) return 1;
        ExceptionRegistration = RtlpGetRegistrationHead();
        /* ECV (SEHOP) */
        if (!DisableExceptionChainValidation && 
            !RtlpIsValidExceptionChain(ExceptionRegistration, ...)) {        
                if (_RtlpProcessECVPolicy != 2) 
                    goto final;
                else
                    RtlReportException();
        }
        /* Walk the handler chain until a suitable handler is found */
        while (ExceptionRegistration != EXCEPTION_CHAIN_END) {
            /* Check the stack limits */
            if (!STACK_LIMITS(ExceptionRegistration)) {
                ExceptionRecord->ExceptionFlags |= EXCEPTION_STACK_INVALID;
                goto final;
            }
            /* Validate the handler */
            if (!RtlIsValidHandler(ExceptionRegistration, ProcessFlags)) goto final;
            /* Transfer control to the handler */
            RtlpExecuteHandlerForException(..., ExceptionRegistration->Handler);
            ...
            ExceptionRegistration = ExceptionRegistration->Next;
        }
        ...
        final:
        /* Call the chain of Vectored Continue Handlers */
        RtlpCallVectoredHandlers(exception, 1);
    }
    

    From the pseudo-code above we can conclude that for control to be transferred successfully to an SEH handler during dispatch, the following conditions must be met:

    1. The chain of SEH frames must be intact (it must end with the ntdll!FinalExceptionHandler handler). The check is performed when SEHOP is enabled for the process.
    2. The SEH frame must be located on the stack.
    3. The SEH frame must contain a pointer to a valid handler.
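
    The first two conditions can be modeled in a few lines of portable C. The sketch below is not the code of RtlpIsValidExceptionChain; it is a simplified model, assuming only what the list above states: every frame lies within the stack limits and the last frame carries the final handler. The type seh_frame, the CHAIN_END marker, and the loop budget that guards against cyclic chains are hypothetical names and choices made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an x86 SEH registration record. */
typedef struct seh_frame {
    struct seh_frame *next;   /* next record, or CHAIN_END */
    const void *handler;      /* handler address (opaque in this model) */
} seh_frame;

/* End-of-chain marker, modeled after the all-ones value used on x86. */
#define CHAIN_END ((seh_frame *)(uintptr_t)-1)

/*
 * Returns true if every frame lies inside [stack_low, stack_high) and the
 * last frame's handler equals final_handler.
 */
static bool chain_is_valid(const seh_frame *head,
                           uintptr_t stack_low, uintptr_t stack_high,
                           const void *final_handler)
{
    assert(stack_low < stack_high);
    /* Hypothetical guard: a chain cannot hold more frames than fit on the stack. */
    size_t budget = (stack_high - stack_low) / sizeof(seh_frame) + 1;
    const seh_frame *frame = head;
    const seh_frame *last = NULL;

    while (frame != CHAIN_END) {
        uintptr_t addr = (uintptr_t)frame;
        if (budget-- == 0)
            return false;                       /* cycle in the chain */
        if (addr < stack_low || addr > stack_high - sizeof *frame)
            return false;                       /* frame is not on the stack */
        last = frame;
        frame = frame->next;
    }
    return last != NULL && last->handler == final_handler;
}
```

    A chain whose tail was overwritten, or whose links lead outside the stack, is rejected before any handler is called, which is the behavior the stack overflow mitigations described above aim for.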

    INFO


    For Vectored Exception Handling, no checks are performed in the dispatcher, which makes VEH a suitable tool when you do not want to bother with SEH support in the program.


    Call stack for the exception filter

    While the first two points are clear and require no additional steps, the procedure that checks the handler for "validity" deserves a closer look. The check is performed by the ntdll!RtlIsValidHandler function, whose pseudo-code for Vista SP1 was first presented to the public back in 2008 at the Black Hat conference in the USA. Although it contained some inaccuracies, that did not stop it from wandering as copy-paste from one resource to another for several years. The code of this function has not changed significantly since then, and analysis of its Windows 8.1 version allowed us to compose the following pseudo-code:

    BOOL RtlIsValidHandler(Handler) // NT 6.3.9600
    {
        if (/* Handler is within an image */) {
            if (DllCharacteristics & IMAGE_DLLCHARACTERISTICS_NO_SEH)
                goto InvalidHandler;
            if (/* The image is a .NET assembly with the ILonly flag set */)
                goto InvalidHandler;
            if (/* A SafeSEH table was found */) {
                if (/* The image is registered in LdrpInvertedFunctionTable (or its cache), or process initialization is not complete */) {
                    if (/* Handler was found in the SafeSEH table */)
                        return TRUE;
                    else
                        goto InvalidHandler;
                }
            }
            return TRUE;
        } else {
            if (/* The ExecuteDispatchEnable and ImageDispatchEnable flags are set in the process ExecuteOptions */)
                return TRUE;
            if (/* Handler is in a non-executable memory area */) {
                if (ExecuteDispatchEnable) return TRUE;
            }
            else if (ImageDispatchEnable) return TRUE;
        }
        InvalidHandler:
            RtlInvalidHandlerDetected(...);
            return FALSE;
    }
    

    In the pseudo-code above, the order of the checks has been changed (in the original, some conditions are checked twice and some are checked in nested functions). From the pseudo-code we can conclude that validation succeeds when one of the following sets of conditions holds. The handler belongs to:

    • an image without SafeSEH, without the NO_SEH flag, and without the ILonly flag;
    • an image with SafeSEH, without the NO_SEH flag, and without the ILonly flag; the image must be registered in LdrpInvertedFunctionTable (not required if the exception occurred during process initialization);
    • a non-executable memory area; the ExecuteDispatchEnable flag (ExecuteOptions) must be set (this works only when No Execute is disabled for the process);
    • an executable memory area; the ImageDispatchEnable flag must be set.

    A memory area is considered an image if the MEM_IMAGE flag is set in the attributes of its region (the attributes are obtained with the NtQueryVirtualMemory function) and its contents correspond to the PE structure. The process flags are obtained with the NtQueryInformationProcess function from KPROCESS.KEXECUTE_OPTIONS. Based on this information, there are at least three ways to support exceptions in dynamically allocated code on the x86 platform:

    1. Setting or spoofing the ImageDispatchEnable flag for the process.
    2. Spoofing the memory region type as MEM_IMAGE (for a PE image without SafeSEH).
    3. Implementing your own exception dispatcher that bypasses all checks.

    We will consider each of these options in detail below. SafeSEH support is also worth mentioning, since you may need it if you are writing, say, an ordinary legitimate PE packer or protector. To implement it, you have to manually add a record about the mapped image (with a pointer to its SafeSEH table) to the global table ntdll!LdrpInvertedFunctionTable. The functions that work with this table are not exported by ntdll.dll, and there is little point in looking for them manually: in older OS versions they still require a pointer to the table itself. If you do find the pointer somehow, you also have to lock access to the table in order to change it safely. An alternative is to unpack the file into one of the unpacker's own sections and move the SafeSEH table from the unpacked file into the main image.

    Spoofing the process ExecuteOptions


    ExecuteOptions (KEXECUTE_OPTIONS) is part of the KPROCESS kernel structure and contains the DEP settings for the process. The structure looks like this:

    typedef struct _KEXECUTE_OPTIONS {
        UCHAR ExecuteDisable : 1;
        UCHAR ExecuteEnable : 1;
        UCHAR DisableThunkEmulation : 1;
        UCHAR Permanent : 1;
        UCHAR ExecuteDispatchEnable : 1;
        UCHAR ImageDispatchEnable : 1;
        UCHAR Spare : 2;
    } KEXECUTE_OPTIONS, *PKEXECUTE_OPTIONS;
    


    ExecuteOptions of the process with DEP enabled.

    The values of these settings (flags) are obtained in user mode by calling NtQueryInformationProcess with the information class 0x22 (ProcessExecuteFlags). The flags are set in the same way with NtSetInformationProcess. Starting with Vista SP1, processes with DEP enabled have the Permanent flag set by default, which forbids changing the settings after the process has been initialized. A fragment of the KeSetExecuteOptions procedure, called in kernel mode from NtSetInformationProcess, confirms this:

    @PermanentCheck:        ; KeSetExecuteOptions +2Fh
    mov     al, [edi+6Ch]   ; current KEXECUTE_OPTIONS
    mov     byte ptr [ebp+arg_0+3], al
    test    al, 8           ; test Permanent
    jnz     short @Fail     ; returns 0C0000022h (STATUS_ACCESS_DENIED)
    

    Thus, ExecuteOptions cannot be changed from user mode when DEP is active. But there remains the option of simply “tricking” RtlIsValidHandler by hooking NtQueryInformationProcess and replacing the flags with the ones we need. Such a hook makes exceptions work in code located outside the modules loaded by the system. Example hook code:

    NTSTATUS __stdcall xNtQueryInformationProcess(HANDLE ProcessHandle, INT ProcessInformationClass, PVOID ProcessInformation, ULONG ProcessInformationLength, PULONG ReturnLength)
    {
        NTSTATUS Status = org_NtQueryInformationProcess(ProcessHandle, ProcessInformationClass, ProcessInformation, ProcessInformationLength, ReturnLength);
        if (!Status && ProcessInformationClass == 0x22) /* ProcessExecuteFlags */
            *(PDWORD)ProcessInformation |= 0x20; /* ImageDispatchEnable */
        return Status;
    }
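
    The constants 0x08 and 0x20 used in the fragments above follow from the order of the KEXECUTE_OPTIONS bitfields, counting from the lowest bit. As a small portable illustration, the sketch below spells out all six masks, models what the hook does to the returned value, and restates the branch of the RtlIsValidHandler pseudo-code that applies to handlers outside any image. The function names and the sample flag value are made up for this example; the decision logic is taken from the pseudo-code in this article, not from ntdll itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks implied by the KEXECUTE_OPTIONS bitfield order (lowest bit first). */
enum {
    EXECUTE_DISABLE         = 0x01,
    EXECUTE_ENABLE          = 0x02,
    DISABLE_THUNK_EMULATION = 0x04,
    PERMANENT               = 0x08,
    EXECUTE_DISPATCH_ENABLE = 0x10,
    IMAGE_DISPATCH_ENABLE   = 0x20
};

/* True if the settings can no longer be changed via NtSetInformationProcess. */
static bool flags_are_locked(uint32_t flags)
{
    return (flags & PERMANENT) != 0;
}

/* Models the hook: the value reported to the caller gains one extra bit. */
static uint32_t spoof_execute_flags(uint32_t flags)
{
    uint32_t spoofed = flags | IMAGE_DISPATCH_ENABLE;
    /* The hook only adds a bit; the real DEP settings stay untouched. */
    assert((spoofed & flags) == flags);
    return spoofed;
}

/*
 * Non-image branch of the validation pseudo-code: a handler in
 * non-executable memory needs ExecuteDispatchEnable, a handler in
 * executable memory needs ImageDispatchEnable.
 */
static bool non_image_handler_allowed(uint32_t flags, bool handler_is_executable)
{
    if (!handler_is_executable)
        return (flags & EXECUTE_DISPATCH_ENABLE) != 0;
    return (flags & IMAGE_DISPATCH_ENABLE) != 0;
}
```

    With a sample value of 0x09 (ExecuteDisable plus Permanent), the flags are locked and a handler in executable dynamic memory is rejected; after the spoofing step the same handler passes, while a handler in non-executable memory is still rejected.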
    

    Spoofing memory attributes


    An alternative to spoofing the process flags is to spoof the attributes of the memory region in which the handler is located. As already noted, RtlIsValidHandler checks the type of the memory region, and if it is MEM_IMAGE, the region is considered an image. Memory allocated with VirtualAlloc cannot be given the MEM_IMAGE type; that type is only assigned to a mapped view of a section (NtCreateSection) created from a valid file handle. As with ExecuteOptions, a hook is needed, this time on NtQueryVirtualMemory:

    NTSTATUS NTAPI xNtQueryVirtualMemory(HANDLE ProcessHandle, PVOID BaseAddress, INT MemoryInformationClass, PMEMORY_BASIC_INFORMATION MemInformation, ULONG Length, PULONG ResultLength)
    {
        NTSTATUS Status = org_NtQueryVirtualMemory(ProcessHandle, BaseAddress, MemoryInformationClass, MemInformation, Length, ResultLength);
        if (!Status && !MemoryInformationClass) /* MemoryBasicInformation */
        {
            if((UINT_PTR)MemInformation->AllocationBase == g_ImageBase) MemInformation->Type = MEM_IMAGE;
        }
        return Status;
    }
    

    The method is suitable for enabling exceptions when injecting an entire PE image or for manually mapped images. This option is also somewhat preferable to the previous one, if only because it does not reduce the security of the process by partially disabling DEP (after all, who needs extra malware?). As a bonus, it lets you pass the internal handler check that modern CRT versions perform for try-except and try-finally constructs (these constructs can also be used without the CRT; see the corresponding box). The CRT validation is performed by the __ValidateEH3RN function, called from _except_handler3; it expects the region to have the MEM_IMAGE type and a correct PE structure.
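
    Both RtlIsValidHandler and the CRT check expect the region to contain a correct PE structure, so the copied image must keep its headers intact. The exact checks are not reproduced in this article; the sketch below only shows the kind of minimal header sanity test this implies, assuming the standard PE layout (the "MZ" signature at offset 0, the e_lfanew field at offset 0x3C, and the "PE\0\0" signature it points to). The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sanity check: returns true if `base` starts with a DOS
 * header whose e_lfanew field points to a "PE\0\0" signature inside the
 * buffer. This is far less than a full validation of the image.
 */
static bool looks_like_pe_image(const uint8_t *base, size_t size)
{
    assert(base != NULL);
    if (size < 0x40)
        return false;                       /* too small for a DOS header */
    if (base[0] != 'M' || base[1] != 'Z')
        return false;

    /* e_lfanew is a little-endian 32-bit value at offset 0x3C. */
    uint32_t e_lfanew = (uint32_t)base[0x3C]
                      | ((uint32_t)base[0x3D] << 8)
                      | ((uint32_t)base[0x3E] << 16)
                      | ((uint32_t)base[0x3F] << 24);
    if (e_lfanew > size - 4)
        return false;                       /* signature would be out of bounds */

    return memcmp(base + e_lfanew, "PE\0\0", 4) == 0;
}
```

    Running such a check on the new region before installing the hook is a cheap way to catch a copy that lost or corrupted its headers.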

    Custom exception dispatcher


    If hooks are unsuitable for some reason, or you simply dislike them, you can go further and replace SEH dispatch entirely with your own code, implementing all the necessary SEH dispatcher logic inside a vectored handler. The RtlDispatchException pseudo-code shows that VEH handlers are called before processing of the SEH chain begins. Nothing prevents us from taking control of the exception in a vectored handler and deciding what to do with it and which handlers to call. The VEH handler is installed with a single line:

    AddVectoredExceptionHandler(0, (PVECTORED_EXCEPTION_HANDLER) &VectoredSEH);
    

    where VectoredSEH is the handler that actually acts as the SEH dispatcher. The complete call chain for this handler looks like this: KiUserExceptionDispatcher -> RtlDispatchException -> RtlpCallVectoredHandlers -> VectoredSEH. The handler does not have to return control to its caller; it can call NtContinue or NtRaiseException itself, depending on whether dispatch succeeded. See the full source code of the SEH-via-VEH implementation in the materials attached to this article, or on GitHub. The implementation is fully functional, and its dispatch logic matches that of the system.


    SEH dispatcher inside a vectored handler

    x64 and IA64


    64-bit versions of Windows for the x64 and Itanium platforms handle exceptions in a completely different way from x86 versions. The method is based on tables that contain all the information needed for dispatch, including the offsets of the beginning and end of the code block that the exception handling covers. As a result, code compiled for these platforms contains no instructions to install and remove a handler for each try-except block. The static exception table is located in the Exception Directory of the PE file and is an array of RUNTIME_FUNCTION structures that look like this:

    typedef struct _RUNTIME_FUNCTION {
        ULONG BeginAddress;
        ULONG EndAddress;
        ULONG UnwindData;
    } RUNTIME_FUNCTION, *PRUNTIME_FUNCTION;
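
    Since the PE format requires the entries of this table to be sorted by BeginAddress, finding the record for a given address reduces to a binary search over RVAs. The sketch below illustrates that idea in portable C; it is a model of the lookup, not the code of any ntdll routine, and the lowercase type and function names are chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as RUNTIME_FUNCTION on x64: three 32-bit RVAs. */
typedef struct {
    uint32_t BeginAddress;
    uint32_t EndAddress;
    uint32_t UnwindData;
} runtime_function;

/*
 * Binary search for the entry whose [BeginAddress, EndAddress) range
 * contains `rva`. Assumes the table is sorted by BeginAddress and the
 * ranges do not overlap. Returns NULL if no entry covers the address.
 */
static const runtime_function *lookup_function_entry(
    const runtime_function *table, size_t count, uint32_t rva)
{
    assert(table != NULL || count == 0);
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].BeginAddress)
            hi = mid;                 /* continue in the lower half */
        else if (rva >= table[mid].EndAddress)
            lo = mid + 1;             /* continue in the upper half */
        else
            return &table[mid];       /* rva falls inside this function */
    }
    return NULL;
}
```

    A NULL result corresponds to code with no unwind information, such as a leaf function that has no table entry.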
    

    The good news is that support for exceptions in dynamic code is implemented at the system level. If the code is located in a memory region that is not an image, or the image lacks a compiler-generated exception table, the information for exception handling is taken from the dynamic function tables (DynamicFunctionTable). The pointer to the list is stored in ntdll!RtlpDynamicFunctionTable, and several functions for working with the list are exported from ntdll.dll. A quick analysis of these functions' listings yielded the following structure of a DynamicFunctionTable list item:

    struct _DynamicFunctionTable {
        /* +0h */
        PVOID   Next;
        PVOID   Prev;           // The first element points to itself
        /* +10h */
        PRUNTIME_FUNCTION Table;// Pointer to the table; for a callback the field is used as ID|0x03
        PVOID   TimeCookie;     // ZwQuerySystemTime
        /* +20h */
        PVOID   RegionStart;    // Offset relative to BaseAddress
        DWORD   RegionLength;   // Region covered by the table (callback)
        /* +30h */
        DWORD64 BaseAddress;    
        PGET_RUNTIME_FUNCTION_CALLBACK Callback;
        /* +40h */
        PVOID   Context;        // User-supplied argument for the callback
        DWORD64 CallbackDll;    // Points to +58h if a DLL is specified
        /* +50h */
        DWORD   Type;           // 1 = table, 2 = callback
        DWORD   EntryCount;
        WCHAR   DllName[1];
    };
    


    RUNTIME_FUNCTION search algorithm

    Elements are added by the functions RtlAddFunctionTable and RtlInstallFunctionTableCallback and removed by RtlDeleteFunctionTable. All of these functions are well documented on MSDN and very easy to use. An example of adding a dynamic table for an image that has just been mapped manually:

    ULONG Size, Length;
    /* Get the compiler-generated table for the mapped image */
    PRUNTIME_FUNCTION Table = (PRUNTIME_FUNCTION) RtlImageDirectoryEntryToData(NewImage, TRUE, IMAGE_DIRECTORY_ENTRY_EXCEPTION, &Size);
    /* Number of entries: divide by the size of the structure, not of a pointer */
    Length = Size / sizeof(RUNTIME_FUNCTION);
    /* Add the image's table to the DynamicFunctionTable list */
    RtlAddFunctionTable(Table, Length, (UINT_PTR)NewImage);
    

    That is all: no hooks, no custom exception dispatchers, no workarounds for system checks. Just note that DynamicFunctionTable is global for the process, so once the code for which a record was added has finished and is about to be freed, the corresponding record should be removed from the table as well. Instead of adding a table, you can install a callback for a range of addresses in the address space; it receives control every time a RUNTIME_FUNCTION record is needed for code in that range. See the source code attached to the article for the callback version.
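
    To make the difference between the two registration kinds more concrete, here is a simplified portable model of a lookup through a list of dynamic entries. It is loosely based on the list item structure shown earlier, but the field set, the absolute region_start, the linear scan, and all names are assumptions made for this example and do not reproduce the system's implementation. A table entry answers from its own array, while a callback entry delegates the question to user code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t BeginAddress;
    uint32_t EndAddress;
    uint32_t UnwindData;
} runtime_function;

/* Same shape as the documented callback: (ControlPc, Context) -> entry. */
typedef const runtime_function *(*lookup_callback)(uint64_t control_pc,
                                                   void *context);

enum { ENTRY_TABLE = 1, ENTRY_CALLBACK = 2 };

/* Hypothetical, simplified list item. */
typedef struct dynamic_entry {
    const struct dynamic_entry *next;
    int type;                       /* ENTRY_TABLE or ENTRY_CALLBACK */
    uint64_t base_address;          /* base used to turn addresses into RVAs */
    uint64_t region_start;          /* absolute start of the covered range */
    uint64_t region_length;
    const runtime_function *table;  /* used by ENTRY_TABLE */
    size_t count;
    lookup_callback callback;       /* used by ENTRY_CALLBACK */
    void *context;
} dynamic_entry;

/* Example callback: always answers with the single record in `context`. */
static const runtime_function *single_record_callback(uint64_t control_pc,
                                                      void *context)
{
    (void)control_pc;
    return (const runtime_function *)context;
}

/* Finds the entry covering control_pc and asks it for a record. */
static const runtime_function *dynamic_lookup(const dynamic_entry *head,
                                              uint64_t control_pc,
                                              uint64_t *image_base)
{
    assert(image_base != NULL);
    for (const dynamic_entry *e = head; e != NULL; e = e->next) {
        if (control_pc < e->region_start ||
            control_pc - e->region_start >= e->region_length)
            continue;                           /* address not covered */
        *image_base = e->base_address;
        if (e->type == ENTRY_CALLBACK)
            return e->callback(control_pc, e->context);
        uint64_t rva = control_pc - e->base_address;
        for (size_t i = 0; i < e->count; i++)
            if (rva >= e->table[i].BeginAddress && rva < e->table[i].EndAddress)
                return &e->table[i];
        return NULL;
    }
    return NULL;
}
```

    The callback form suits code whose unwind data is produced lazily, for example by a JIT, because nothing has to be registered per function in advance.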


    Exception handled

    __finally


    Low-level Windows programming with the native API does not force exceptions on you as an error-handling method, and developers of "specific software" often either neglect them or limit themselves to installing an unhandled exception filter or using plain VEH. Nevertheless, exceptions remain a powerful mechanism, and the more complex your program's architecture, the more you gain from it. Thanks to the methods discussed in this article, you can use exceptions even in the most unusual conditions.

    Useful materials



    I also recommend getting Windows Research Kernel (the main part of the NT5.2 kernel source code). WRK is distributed to universities and academic organizations, but it’s not for me to teach you how and where to look for such things.

    Try-except and try-finally constructs without CRT


    If you are going to use exception and termination blocks, make sure the program contains the procedure that the compiler substitutes as the actual handler: for x86 projects this is _except_handler3, and for x64 it is __C_specific_handler. These procedures perform their own dispatch: they find and call the necessary handlers and unwind the stack. There is no particular need to write them yourself. For an x86 project you can simply link expsup3.lib from the old DDK (ntdll.lib from the DDK also contains the necessary functions); for x64 it is even easier: __C_specific_handler is exported by the 64-bit version of ntdll.dll, so just use the correct lib file.


    First published in Hacker Magazine # 195.
    Posted by Teq

