
Memory management in real-mode Windows

(I wonder whether the ever-present holy warriors with their "it was a graphical shell, not an operating system" know about these extraordinary abilities?)
And how did it manage all this?
Data management

The functions GlobalLock / GlobalUnlock and LockResource / FreeResource were preserved in the Win32 API for compatibility with those ancient times, although in Win32 memory blocks (including resources) never move. The functions LockSegment and UnlockSegment (which fix / release memory by address, not by handle) lingered in the documentation for a while marked “obsolete, do not use”, but by now not even a trace of them remains. For those who needed to fix memory for a long period of time, there was another function, GlobalWire: “so that the block does not stick out in the middle of the address space, move it to the lower edge of memory and fix it there”. Its counterpart was GlobalUnwire, completely equivalent to GlobalUnlock. This pair of functions is, surprisingly, still alive in kernel32.dll, although it has already been removed from the documentation. Now they simply forward to GlobalLock / GlobalUnlock.
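The role of the handle-based fixing functions can be pictured with a toy model. This is only a sketch under assumptions: the class MovableHeap, its methods, and the flat integer addresses are invented for illustration and are not the Windows API. It shows why a program kept a handle rather than an address, and why fixing a block stops the memory manager from moving it.

```python
class MovableHeap:
    """Toy movable heap: handles stay stable, addresses may change."""

    def __init__(self):
        self._blocks = {}        # handle -> {"addr", "size", "locks"}
        self._next_handle = 1

    def alloc(self, addr, size):
        handle = self._next_handle
        self._next_handle += 1
        self._blocks[handle] = {"addr": addr, "size": size, "locks": 0}
        return handle

    def lock(self, handle):
        """Fix the block in place and return its current address."""
        block = self._blocks[handle]
        block["locks"] += 1
        return block["addr"]

    def unlock(self, handle):
        block = self._blocks[handle]
        if block["locks"] > 0:
            block["locks"] -= 1

    def compact(self):
        """Slide unlocked blocks toward address 0; locked blocks stay put."""
        next_free = 0
        for block in sorted(self._blocks.values(), key=lambda b: b["addr"]):
            if block["locks"] == 0:
                block["addr"] = next_free
            next_free = block["addr"] + block["size"]


heap = MovableHeap()
a = heap.alloc(addr=0x100, size=0x40)
b = heap.alloc(addr=0x200, size=0x40)

heap.lock(b)        # b is fixed at its current address
heap.compact()      # a slides down, b cannot move
heap.unlock(b)
heap.compact()      # now b is free to move as well
print(hex(heap.lock(a)), hex(heap.lock(b)))
```

An address obtained from lock is only meaningful until the matching unlock; after that the program has to go back to the handle.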
GlobalLock was replaced with a “stub”: Windows can now shuffle memory blocks without changing their “virtual address” visible to the application (selector:offset), which means that the application no longer needs to fix non-discardable objects. In other words, fixing now prevents a block from being unloaded, but does not prevent it from moving (invisibly to the application). Therefore, to fix data “for real” in physical memory, for those who need exactly that (for example, to work with external devices), the pair GlobalFix / GlobalUnfix was added. Just like GlobalWire / GlobalUnwire, these functions became useless in Win32; they were likewise removed from the documentation, although they remain in kernel32.dll and forward to GlobalLock / GlobalUnlock.

Code management

This is where the trickiest part begins. Blocks of code, like immutable data, were discarded from memory and later reloaded from the executable file. But how did Windows ensure that programs did not try to call functions in unloaded blocks? One could access functions through handles and call a hypothetical LockFunction before each function call; but remember that many functions spin a “message loop” (for example, to show a window or execute DDE commands), and it should be possible to unload them too, because their code is not actually needed during that time. With “function handles”, however, a function's segment could not be freed until the function returned control to its caller.
So Windows walks the stacks of all running tasks (as execution contexts were called in Windows before processes and threads were separated), finds the return addresses that point inside the segments being unloaded, and replaces them with the addresses of reload thunks: “stubs” that load the required segment from the executable file and transfer control into it as if nothing had happened.
For Windows to be able to walk the stack, programs must keep it in the correct format: no FPO, and each stack frame must begin with the saved BP, a pointer to the frame of the calling function. (Since the stack consists of 16-bit words, the value of BP is always even.) In addition, Windows must distinguish intra-segment (“near”) calls from inter-segment (“far”) calls on the stack, and it can ignore near calls: they certainly do not lead into an unloaded segment. It was therefore decided that an odd saved BP value on the stack marks a far call, i.e. every far function must begin with the prologue INC BP; PUSH BP; MOV BP,SP and end with the epilogue POP BP; DEC BP; RETF.
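The stack walk can be sketched as a small simulation. Everything here is an assumption made for illustration: the stack is modeled as a dictionary of 16-bit words, a saved BP of 0 (or 1 for a far frame) is taken to end the chain, and THUNK_SEG, THUNK_SIZE and the thunk numbering are hypothetical. Only the odd/even BP convention comes from the text.

```python
THUNK_SEG = 0xF000   # hypothetical segment that holds the reload thunks
THUNK_SIZE = 8       # hypothetical size of one thunk, in bytes


def patch_return_addresses(stack, bp, unloaded_segs, thunks):
    """Walk the BP chain and redirect far returns into unloaded segments.

    stack         -- dict: even word address -> 16-bit value
    bp            -- current BP register; the chain ends when it reaches 0
    unloaded_segs -- set of segment values that are about to be unloaded
    thunks        -- dict filled in here: thunk offset -> original (seg, off)
    Returns the number of return addresses that were patched.
    """
    patched = 0
    while bp != 0:
        saved_bp = stack[bp]
        if saved_bp & 1:
            # Odd saved BP: this frame was created by a far call, so a full
            # segment:offset return address sits right above the saved BP.
            ret_off, ret_seg = stack[bp + 2], stack[bp + 4]
            if ret_seg in unloaded_segs:
                thunk_off = len(thunks) * THUNK_SIZE
                thunks[thunk_off] = (ret_seg, ret_off)
                stack[bp + 2], stack[bp + 4] = thunk_off, THUNK_SEG
                patched += 1
        # Near frames are skipped: they return within the same segment.
        bp = saved_bp & ~1   # drop the marker bit to reach the caller's frame
    return patched


# Three frames, innermost first: far (returns into 0x2000), near, far (0x3000).
stack = {
    0x0FE0: 0x0FF1, 0x0FE2: 0x0123, 0x0FE4: 0x2000,
    0x0FF0: 0x0FF8, 0x0FF2: 0x0456,
    0x0FF8: 0x0001, 0x0FFA: 0x0789, 0x0FFC: 0x3000,
}
thunks = {}
count = patch_return_addresses(stack, 0x0FE0, {0x2000}, thunks)
print(count, hex(stack[0x0FE4]), thunks)
```

The marker bit costs nothing to store because frame pointers are always even, which is why the walker can tell far frames from near ones without any extra table.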
(Actually, the prologue and epilogue were more complicated, but that is beside the point here.)

Every function listed in the module's entry table is represented there by a stub: the instruction int 3fh and three more service bytes indicating where to look for the function. The int 3fh handler finds these service bytes at its return address, determines the required segment, loads it into memory if it is not already loaded, and finally overwrites the stub in the entry table with an absolute jump jmp xxxx:yyyy to the function body, so that subsequent calls to the same function are slowed down by only one inter-segment jump, with no interrupt.

Now, when Windows unloads a function, it only needs to replace the inserted jump in the module's entry table with the int 3fh stub again. The system does not need to search for all calls to the unloaded function: they were all found back at compile time! The module's “entry table” contains all far functions for which the compiler knows that inter-segment calls exist (this includes, in particular, exported functions and WinMain), as well as all far functions whose pointer was passed somewhere, which means they could be called from anywhere, even from outside the program's code (this includes WndProc, EnumFontFamProc and other callback functions).
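The life cycle of an entry-table stub can be modeled in a few lines. This is a hedged sketch, not the real loader: the class ModuleModel and its fields are hypothetical, stubs are represented as tuples instead of machine code, and the patched jmp stores a segment number rather than a real segment address.

```python
class ModuleModel:
    """Toy model of one module's entry table (all names are hypothetical)."""

    def __init__(self, entries):
        # entries: function name -> (segment number, offset inside the segment)
        self.entry_table = {
            name: ("int 3fh", seg, off) for name, (seg, off) in entries.items()
        }
        self.loaded = set()    # segment numbers currently in memory
        self.load_count = 0    # how many times a segment was read from the file

    def _int3f_handler(self, name):
        _, seg, off = self.entry_table[name]
        if seg not in self.loaded:            # load the segment only if needed
            self.loaded.add(seg)
            self.load_count += 1
        self.entry_table[name] = ("jmp", seg, off)   # patch the stub in place

    def call(self, name):
        """In this model every call goes through the entry table."""
        if self.entry_table[name][0] == "int 3fh":
            self._int3f_handler(name)
        _, seg, off = self.entry_table[name]
        return seg, off                       # where control finally lands

    def unload(self, seg):
        """Unloading only touches the entry table; callers are left alone."""
        self.loaded.discard(seg)
        for name, (kind, s, off) in list(self.entry_table.items()):
            if s == seg and kind == "jmp":
                self.entry_table[name] = ("int 3fh", s, off)


mod = ModuleModel({"WndProc": (2, 0x0010), "Helper": (2, 0x0080)})
mod.call("WndProc")     # first call: the handler loads segment 2, patches the stub
mod.call("WndProc")     # later calls: only the patched jmp is taken
mod.unload(2)           # the stubs for segment 2 revert to int 3fh
print(mod.entry_table["WndProc"][0], mod.load_count)
```

Because callers only ever hold the address of the stub, unloading a segment is a matter of rewriting a few table entries, however many call sites exist.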
GetWindowLong(GWL_WNDPROC) and similar calls also return a pointer to the stub, not to the function body. Even GetProcAddress plays along: instead of the function's address it returns the address of its stub in the DLL's entry table. (In Win32, only DLLs retained an analogue of the “entry table”, under the name “export table”.) Static inter-module calls (calls to functions imported from DLLs) are resolved through the same GetProcAddress, and therefore go through the stubs in exactly the same way. In any case, when unloading a function it is enough to fix up the stub; the calling code itself does not need to be touched.

All this cleverness with relocatable code segments came to Windows “by inheritance” from an overlay linker for DOS. Supposedly, the whole scheme first appeared, in exactly this form, in the Zortech C compiler, and then in Microsoft C. When the executable file format for Windows was created, the existing overlay format for DOS was taken as its basis.

Before each stub in the entry table (int 3fh or the jmp that replaces it) there is the instruction sar byte ptr cs:[xxx], 1, which resets a byte counter from 1 to 0 each time the function is called. This instruction takes exactly five bytes, so the existing executable file format can be kept: the int 3fh stubs are loaded into every other slot, interleaved with counter instructions. The counters of all code segments are initialized to 1, and every 250 ms Windows walks all modules, collects the updated values, and reorders the code segments in its LRU list. Accesses to data segments can be tracked without any tricks: every such access is already marked by an explicit call to GlobalLock or a similar function. So when the time comes to unload a segment to free up memory, Windows tries to unload the one that has gone unused the longest: either a code segment whose counter has not been reset to 0 for the longest time, or a data segment that has gone the longest without being fixed.

Windows 1.0-2.1 advertisements taken from GUIdebook
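The usage-counter scheme for code segments can be modeled as follows. This is a sketch under assumptions: the class CodeSegmentLRU is hypothetical, one counter per segment is assumed, and the periodic pass is assumed to set a counter back to 1 after reading it, which the text implies but does not state outright.

```python
class CodeSegmentLRU:
    """Toy model of the per-segment usage counters and the LRU list."""

    def __init__(self, names):
        self.counter = {name: 1 for name in names}   # 1 = not called since last pass
        self.lru = list(names)                       # index 0 = least recently used

    def call(self, name):
        # Models "sar byte ptr cs:[xxx], 1": 1 becomes 0, and 0 stays 0.
        self.counter[name] >>= 1

    def sweep(self):
        """Models the periodic pass (every 250 ms in the text)."""
        touched = [name for name in self.lru if self.counter[name] == 0]
        for name in touched:
            self.lru.remove(name)
            self.lru.append(name)        # move to the most recently used end
            self.counter[name] = 1       # assumption: re-armed for the next interval

    def victim(self):
        """The segment that would be unloaded first."""
        return self.lru[0]


segments = CodeSegmentLRU(["A", "B", "C"])
segments.call("A")
segments.sweep()
print(segments.lru, segments.victim())
```

The appeal of the design is that the hot path pays for a single shift instruction per call, while all the bookkeeping happens in the infrequent pass.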