Why is 0x00400000 the default base address for an EXE?
The default base address for a DLL is 0x10000000, but for executable files it is 0x00400000. Why does the EXE get this particular value? What is so special about 4 megabytes?
It has to do with the amount of address space mapped by a single page table on the x86 architecture, and with a design decision made in 1987.
The only technical requirement for the base address of an EXE is that it be a multiple of 64 KB. But some base addresses are better choices than others.
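The 64 KB rule is easy to check mechanically. The following is a minimal sketch, not anything the linker or loader actually runs; the names `BASE_ALIGNMENT`, `is_valid_base`, and `round_down_to_alignment` are made up for illustration.

```python
# Hypothetical helper names; only the 64 KB figure comes from the article.
BASE_ALIGNMENT = 64 * 1024  # 0x10000


def is_valid_base(address: int) -> bool:
    """Return True if the address is a multiple of 64 KB."""
    return address % BASE_ALIGNMENT == 0


def round_down_to_alignment(address: int) -> int:
    """Round an arbitrary address down to the nearest 64 KB multiple."""
    return address & ~(BASE_ALIGNMENT - 1)
```

All three addresses discussed in this article (0x00010000, 0x00400000, and 0x10000000) pass this check, so the alignment rule alone does not explain the choice of 4 MB.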
The goal in choosing a base address is to minimize the likelihood that modules will have to be relocated. This means avoiding collisions 1) with objects that are already in the address space (which would force you to relocate), and 2) with objects that may show up in the address space later (which would force them to relocate). For executable files, avoiding conflicts with objects that may show up later means staying out of the region of the address space that tends to fill up with DLLs. Since the operating system itself puts its DLLs at high addresses and the default base address for non-system DLLs is 0x10000000, the base address for the EXE should be somewhere below 0x10000000, and the lower it is, the more room you have before you start colliding with DLLs. But how low can you go?
Item 1 means that you must also avoid objects that are already in memory. On Windows NT, there was not much at the low addresses. The only thing there was a PAGE_NOACCESS page at address zero, placed there to catch null pointer accesses. Therefore, on Windows NT you could base an executable at 0x00010000, and many applications did just that.
But Windows 95 had much more loaded at the lowest addresses. The Windows 95 Virtual Machine Manager permanently mapped the first 64 KB of physical memory to the first 64 KB of virtual memory to work around a CPU bug. (Windows 95 had to work around a lot of CPU bugs and firmware bugs.) Moreover, the entire first megabyte of virtual address space was mapped to the logical address space of the active virtual machine. (For pedants: in reality, a little more than a megabyte.) This mapping was required by the x86 processor's virtual-8086 mode.
Windows 95, like its predecessor Windows 3.1, ran Windows in a special virtual machine (known as the System VM), and for compatibility it still routed all sorts of things through 16-bit code, just to make sure that the duck quacked correctly. Therefore, even while the CPU was running a Windows application (rather than an MS-DOS application), the virtual machine's address-space mapping stayed in place, so that the pages did not have to be remapped (with the expensive translation lookaside buffer flush that comes with it) every time the system needed to drop into the MS-DOS compatibility layer.
So the first megabyte of address space is off the table. What about the other three megabytes?
This is where the hint from the beginning of the article comes in.
To make context switches fast, the Windows 3.1 Virtual Machine Manager rounded the context of each virtual machine up to 4 MB. It did this so that a context switch could be performed by updating a single 32-bit value in the page directory. (For pedants: some pages need their attributes updated as well, but that is only a dozen or so bits.) The rounding costs three megabytes of address space, but with 4 gigabytes of address space available, a loss of less than 0.1% seemed like a small price for a significant performance improvement. (Besides, at the time no application came anywhere near that limit. A typical computer had only about 2 MB of physical memory.)
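The arithmetic behind the 4 MB figure and the "less than 0.1%" estimate can be spelled out. The sketch below assumes classic 32-bit x86 paging with 4 KB pages and 1024 four-byte entries per table, so one top-level entry (a page directory entry) covers 4 MB. The constant and function names are illustrative and not taken from any Windows source.

```python
# Assumption: classic two-level 32-bit x86 paging, 4 KB pages.
PAGE_SIZE = 4 * 1024              # bytes per page
ENTRIES_PER_TABLE = 1024          # 32-bit entries in one 4 KB table
SPAN_PER_ENTRY = PAGE_SIZE * ENTRIES_PER_TABLE  # 0x00400000 = 4 MB
ADDRESS_SPACE = 2 ** 32           # 4 GB of virtual address space
ONE_MB = 1024 * 1024


def page_directory_index(address: int) -> int:
    """Top 10 bits of a 32-bit address select the page directory entry."""
    return (address >> 22) & 0x3FF


def round_up_to_span(size: int) -> int:
    """Round a size up to a whole number of 4 MB spans."""
    return (size + SPAN_PER_ENTRY - 1) // SPAN_PER_ENTRY * SPAN_PER_ENTRY


# The virtual machine needs roughly the first megabyte; rounding that up
# to one full page directory entry reserves 4 MB, wasting about 3 MB.
reserved = round_up_to_span(ONE_MB)
wasted = reserved - ONE_MB
wasted_fraction = wasted / ADDRESS_SPACE  # 3/4096, roughly 0.07%
```

Everything below 0x00400000 falls under page directory index 0, and 0x00400000 is the first address with index 1, which is why swapping that one entry is enough to swap the whole virtual machine context.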
This memory mapping scheme was carried forward into Windows 95, with some changes to support separate address spaces for 32-bit Windows applications. As a result, the lowest address at which an executable could be loaded on Windows 95 was 4 MB, that is, 0x00400000.
Trivia for geeks: to prevent Win32 applications from accessing the memory area used for MS-DOS compatibility, the flat data selector was actually an expand-down selector that stopped at the 4 MB boundary. (Similarly, a null pointer in a 16-bit Windows application resulted in an access violation because the null selector was invalid.)
The linker chooses 0x00400000 as the default base address for executable files so that the EXE can load without relocation on both Windows NT and Windows 95. Nobody really cares about optimizing for Windows 95 any more, so in principle the linker developers could now pick a different default base address. But there is no real incentive to do so, other than making address-space diagrams look prettier, especially since ASLR makes the whole question moot anyway. And besides, if they changed the base address, people would start asking: "Why do some executables have a base address of 0x00400000, while others have 0x00010000?"
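To see which base address the linker recorded for a given file, you can read the ImageBase field from the PE header. This is a minimal sketch that assumes the standard PE/COFF layout (the PE header offset stored at 0x3C, a 20-byte COFF header, and ImageBase at offset 28 of a PE32 optional header or offset 24 of a PE32+ one). The function name `read_image_base` is made up for this example.

```python
import struct


def read_image_base(data: bytes) -> int:
    """Return the preferred base address stored in a PE image's headers."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Offset 0x3C of the DOS header holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, optional_header)
    if magic == 0x10B:    # PE32: ImageBase is a 32-bit field at offset 28
        (image_base,) = struct.unpack_from("<I", data, optional_header + 28)
    elif magic == 0x20B:  # PE32+: ImageBase is a 64-bit field at offset 24
        (image_base,) = struct.unpack_from("<Q", data, optional_header + 24)
    else:
        raise ValueError("unknown optional header magic")
    return image_base
```

You would pass it the first few kilobytes of an EXE read in binary mode. Keep in mind that this is only the preferred address stored in the file; with ASLR enabled, the address where the image actually loads can differ.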
TL;DR: For fast context switching.