Prefixes in the IA-32 Instruction Set
Today I want to talk about prefixes in the Intel IA-32 instruction set, in both its 32-bit and 64-bit versions (also referred to as x86 and x86_64). But first, a brief reminder of the general structure of an IA-32 instruction:

- Prefixes. May be absent; several may be present at once.
- Opcode. Consists of one, two, or three bytes.
- ModR/M byte. Used to address operands. It is omitted if the instruction has no explicit operands.
- SIB (Scale-Index-Base) byte. A second byte used to address operands in memory. May be absent.
- Displacement (address offset). 1, 2, or 4 bytes, or absent.
- Immediate (constant). 1, 2, or 4 bytes, or absent.
A more detailed description of the instruction structure can be found in the article "Disassembler with your own hands" and, of course, in the Intel 64 and IA-32 Architectures Software Development Manuals. This article covers the IA-32 prefixes, the peculiarities of their use, and the trends in their development.
Single-Byte Prefixes
Single-byte prefixes have been part of the IA-32 instruction set almost since the very first Intel processors. They have already been covered on Habr, so I will not discuss them here.
Mandatory Prefixes
With the introduction of the SSE extensions, some of the single-byte prefixes, namely 0xf2, 0xf3, and 0x66, began in certain cases to act as part of the opcode. These are the so-called mandatory prefixes. Examples of such instructions are given below.

Encoding | Instruction | Mandatory Prefix |
---|---|---|
0x0f 0x10 | MOVUPS | - |
0xf2 0x0f 0x10 | MOVSD | 0xf2 |
0xf3 0x0f 0x10 | MOVSS | 0xf3 |
0x66 0x0f 0x10 | MOVUPD | 0x66 |
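
The table above can be read as a lookup keyed by the optional mandatory prefix. The following is only a minimal sketch of that idea: the names `SSE_0F10` and `decode_0f10` are hypothetical, and the function handles nothing beyond the four encodings listed in the table.

```python
# Hypothetical lookup built from the four table rows; not a general decoder.
SSE_0F10 = {
    None: "MOVUPS",   # no mandatory prefix
    0xF2: "MOVSD",
    0xF3: "MOVSS",
    0x66: "MOVUPD",
}


def decode_0f10(code: bytes) -> str:
    """Return the mnemonic for an encoding whose opcode is 0x0f 0x10."""
    prefix = None
    pos = 0
    # A leading 0xf2, 0xf3, or 0x66 is treated as the mandatory prefix.
    if code[pos] in (0xF2, 0xF3, 0x66):
        prefix = code[pos]
        pos += 1
    if code[pos:pos + 2] != b"\x0f\x10":
        raise ValueError("expected opcode bytes 0x0f 0x10")
    return SSE_0F10[prefix]


if __name__ == "__main__":
    for enc in (b"\x0f\x10", b"\xf2\x0f\x10", b"\xf3\x0f\x10", b"\x66\x0f\x10"):
        print(enc.hex(" "), "->", decode_0f10(enc))
```

The same opcode bytes select four different instructions, so the prefix has to be consumed before the opcode can be interpreted.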
It is easy to see that the encodings of these instructions differ only in the prefix. Their opcode is the same: 0x0f 0x10. Yet the semantics of the instructions differ. For example, MOVSD copies 64 bits from one operand to another, while MOVUPD copies 128 bits.
REX Prefix
At some point it became necessary to support a 64-bit address space and to increase the number of addressable registers. AMD's engineers solved this problem by adding a prefix named REX. It is also a single byte and has the form 0x4* (0x40 through 0x4f). Its bits extend the fields encoded in the ModR/M byte and also select the operand width. The figure shows an example of using a REX prefix for register addressing.
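
As a rough illustration of how the REX bits extend the ModR/M fields, here is a minimal sketch. It assumes the bit layout 0100WRXB from the Intel manuals and the register-direct form of ModR/M (mod = 0b11); the function names `decode_rex` and `extended_modrm_regs` are hypothetical.

```python
def decode_rex(byte: int) -> dict:
    """Split a REX byte (0x40-0x4f) into its W, R, X, and B bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError("not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 1 selects a 64-bit operand size
        "R": (byte >> 2) & 1,  # extends ModR/M.reg
        "X": (byte >> 1) & 1,  # extends SIB.index
        "B": byte & 1,         # extends ModR/M.rm (or SIB.base)
    }


def extended_modrm_regs(rex: int, modrm: int) -> tuple:
    """Return the 4-bit (reg, rm) register numbers for a register-direct ModR/M."""
    bits = decode_rex(rex)
    reg = (bits["R"] << 3) | ((modrm >> 3) & 0b111)
    rm = (bits["B"] << 3) | (modrm & 0b111)
    return reg, rm


if __name__ == "__main__":
    # 4d 89 c8 encodes "mov r8, r9": rm = 8 is the destination, reg = 9 the source.
    print(decode_rex(0x4D))
    print(extended_modrm_regs(0x4D, 0xC8))
```

Each 3-bit register field gains a fourth, most significant bit from the prefix, which is what makes R8-R15 addressable.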
Several peculiarities of this prefix are worth noting. The encoding 0x4* is interpreted as a prefix only in 64-bit mode; in all other modes these bytes encode short forms of the INC and DEC instructions. An interesting property of the prefix is that it must be placed immediately before the opcode byte, otherwise it is ignored. If the REX prefix is used with an instruction that requires a mandatory prefix, REX must be placed between that prefix and the opcode byte.
VEX Prefix
With the introduction of the AVX extension, a new prefix named VEX appeared in the IA-32 instruction set. It is no longer a single byte: it consists of either two or three bytes, depending on its first byte, 0xc5 for the two-byte form and 0xc4 for the three-byte form.
The fields R, X, B, and W have the same meaning as the corresponding fields of the REX prefix. The pp field provides functionality equivalent to the mandatory SIMD prefixes (for example, 0b01 = 0x66). The m-mmmm field stands in for the leading opcode bytes (for example, 0b00011 = 0x0f 0x3a). The L field determines the vector length: 0 means 128 bits, 1 means 256 bits.
Using a VEX prefix provides the following benefits:
- Support for up to four operands.
- Support for 128-bit XMM registers and 256-bit YMM registers.
- More compact encoding of already existing instructions.
- No need for a REX prefix to address the general-purpose registers R8-R15 and the vector registers XMM8-XMM15 (YMM8-YMM15). VEX can encode the same fields as REX and, in addition, several new ones.
Note that using the VEX prefix together with certain single-byte prefixes (0xf0, 0x66, 0xf2, 0xf3, REX) is prohibited and raises the #UD exception.
EVEX Prefix
Not so long ago, Intel announced a new instruction set extension named AVX3, or AVX-512. With it came a new prefix called EVEX. Its description can be found in the Intel Architecture Instruction Set Extensions Programming Reference.
It is an improved version of the VEX prefix. It is 4 bytes long and starts with the byte 0x62, which in all modes except 64-bit corresponds to the BOUND instruction, rarely used in modern programs. Here are some features of the EVEX prefix that I find interesting:
- Two bits for the vector length, L'L, which are needed to support vectors of 128, 256, and 512 bits.
- Support for addressing the new 512-bit registers ZMM8-ZMM31.
- Support for mask register operands (opmask registers), selected by the EVEX.aaa field.
- EVEX.mm, a field equivalent to VEX.m-mmmm that takes two bits instead of five.
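
The fields listed above can be extracted in the same way as for VEX. This is a minimal sketch that assumes the payload layout from Intel's reference (mm in the low two bits of the second byte, L'L in bits 6 and 5 of the fourth byte, aaa in its low three bits). The name `decode_evex` is hypothetical, and the bytes are interpreted as they would be in 64-bit mode.

```python
# L'L value -> vector length in bits; 0b11 is left unmapped in this sketch.
VECTOR_BITS = {0b00: 128, 0b01: 256, 0b10: 512}


def decode_evex(code: bytes) -> dict:
    """Extract a few fields from a 4-byte EVEX prefix at the start of code."""
    if len(code) < 4 or code[0] != 0x62:
        raise ValueError("not an EVEX prefix")
    p0, p1, p2 = code[1], code[2], code[3]
    ll = (p2 >> 5) & 0b11
    return {
        "mm": p0 & 0b11,          # counterpart of VEX.m-mmmm
        "W": (p1 >> 7) & 1,
        "pp": p1 & 0b11,          # same role as VEX.pp
        "LL": ll,                 # L'L vector length bits
        "vector_bits": VECTOR_BITS.get(ll),
        "aaa": p2 & 0b111,        # opmask register number, 0 means no mask
    }


if __name__ == "__main__":
    # 62 f1 74 48 58 c2 encodes "vaddps zmm0, zmm1, zmm2".
    print(decode_evex(bytes([0x62, 0xF1, 0x74, 0x48, 0x58, 0xC2])))
```

Changing the last prefix byte from 0x48 to 0x49 in the example sets aaa to 1, which would select the opmask register k1.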
Conclusion
In conclusion, I would like to mention some of the reasons why such a complex and, in places, illogical instruction set came about. The history of the Intel IA-32 instruction set begins in the 1970s, when nobody was thinking about 64-bit modes. Besides Intel, AMD has made a significant contribution to the evolution of IA-32. A great deal of effort has gone into maintaining backward compatibility between different processor models. Many interesting facts about the development of the IA-32 architecture can be found in the article by A. Fog.
Thanks to Atakua for commenting on the drafts of this article.
P.S. All illustrations are taken from the Intel 64 and IA-32 Architectures Software Development Manuals.