The Way to Understanding V8 Bytecode
V8 is Google's open source JavaScript engine. It is used by Chrome, Node.js and many other applications. This article, written by Google employee Franziska Hinkelmann, describes the V8 bytecode format. Bytecode is fairly easy to read once you understand a few basic concepts.
V8 compilation pipeline
Ignition! Lift-off! The Ignition interpreter has been part of the V8 compilation pipeline since 2016.
When V8 compiles JavaScript code, the parser generates an abstract syntax tree, a tree representation of the syntactic structure of the JavaScript code. The Ignition interpreter generates bytecode from this syntax tree. The optimizing compiler, TurboFan, eventually generates optimized machine code from the bytecode.
Figure: V8 compilation pipeline
If you want to know why V8 has two execution modes, take a look at my talk from JSConf EU.
V8 Bytecode Basics
Bytecode is an abstraction of machine code. Compiling bytecode to machine code is easier if the bytecode is designed with the same computational model as the physical processor. That is why interpreters are often register or stack machines.
The Ignition interpreter is a register machine with an accumulator register.
Figure: the code on the left is convenient for people; the code on the right is for machines.
V8 bytecodes can be thought of as small building blocks that, put together, can implement any JavaScript functionality. V8 has several hundred bytecodes. There are bytecodes for operators, such as Add or TypeOf, and for loading properties, such as LdaNamedProperty. V8 also has some fairly specific bytecodes, such as CreateObjectLiteral or SuspendGenerator. The header file bytecodes.h contains the complete list of V8 bytecodes.
Each bytecode specifies its inputs and outputs as register operands. Ignition uses registers r0, r1, r2, ... and an accumulator register. Almost all bytecodes use the accumulator. It is like a regular register, except that bytecodes do not name it explicitly. For example, Add r1 adds the value in register r1 to the value in the accumulator. This keeps bytecodes shorter and saves memory.
Many bytecode names begin with Lda or Sta. The a in Lda and Sta stands for accumulator. For example, LdaSmi [42] loads the small integer (Smi) 42 into the accumulator. Star r0 stores the value currently in the accumulator in register r0.
Function Bytecode Analysis
Now that we have covered the basic concepts, let's look at the bytecode of a real function.
function incrementX(obj) {
  return 1 + obj.x;
}
incrementX({x: 42}); // V8's compiler is lazy, so if you don't call the function, V8 won't interpret it
If you want to see the bytecode for a piece of JavaScript code, you can print it by running D8 (V8's developer shell) or Node.js (version 8.3 or later) with the flag --print-bytecode. For Chrome, start it from the command line with --js-flags="--print-bytecode"; the Chromium page on running Chromium with flags explains how.
$ node --print-bytecode incrementX.js
...
[generating bytecode for function: incrementX]
Parameter count 2
Frame size 8
   12 E> 0x2ddf8802cf6e @ StackCheck
   19 S> 0x2ddf8802cf6f @ LdaSmi [1]
         0x2ddf8802cf71 @ Star r0
   34 E> 0x2ddf8802cf73 @ LdaNamedProperty a0, [0], [4]
   28 E> 0x2ddf8802cf77 @ Add r0, [6]
   36 S> 0x2ddf8802cf7a @ Return
Constant pool (size = 1)
0x2ddf8802cf21: [FixedArray] in OldSpace
- map = 0x2ddfb2d02309
Most of this output can be ignored; the bytecodes themselves are what matter. Here is what each of them does.
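To focus on the bytecodes alone, the listing can be filtered with a few lines of JavaScript. This is only a sketch: extractBytecodes is a hypothetical helper, not part of V8 or Node.js, and the regular expression assumes the line layout of the listing above, which may differ in other V8 versions.

```javascript
// The bytecode lines from the listing above, pasted in as sample input.
const listing = `
   12 E> 0x2ddf8802cf6e @ StackCheck
   19 S> 0x2ddf8802cf6f @ LdaSmi [1]
         0x2ddf8802cf71 @ Star r0
   34 E> 0x2ddf8802cf73 @ LdaNamedProperty a0, [0], [4]
   28 E> 0x2ddf8802cf77 @ Add r0, [6]
   36 S> 0x2ddf8802cf7a @ Return
Constant pool (size = 1)
0x2ddf8802cf21: [FixedArray] in OldSpace
`;

// Hypothetical helper: keeps only lines of the form
// "[source position] 0xADDRESS @ Name operands" (layout assumed from the listing).
function extractBytecodes(text) {
  const linePattern = /^\s*(?:\d+\s+[ES]>\s+)?0x[0-9a-f]+\s+@\s+(\w+)\s*(.*)$/;
  const result = [];
  for (const line of text.split('\n')) {
    const match = linePattern.exec(line);
    if (!match) continue; // headers, constant pool lines and blank lines are skipped
    const [, name, operandText] = match;
    const operands = operandText.trim() === ''
      ? []
      : operandText.split(',').map((operand) => operand.trim());
    result.push({ name, operands });
  }
  return result;
}

const bytecodes = extractBytecodes(listing);
for (const { name, operands } of bytecodes) {
  console.log(name, operands.join(', '));
}
```

Each entry keeps the bytecode name and its operands as strings, which is enough to follow the descriptions below.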
LdaSmi [1]
LdaSmi [1] loads the constant 1 into the accumulator.
Star r0
Star r0 stores the value currently in the accumulator, 1, in register r0.
LdaNamedProperty a0, [0], [4]
LdaNamedProperty loads a named property of a0 into the accumulator. The notation ai refers to the i-th argument of incrementX(). In this example, we look up a named property on a0, the first argument of incrementX(). The name is determined by the constant 0. LdaNamedProperty uses 0 to look up the name in a separate table, the constant pool:
- length: 1
0: 0x2ddf8db91611
Here, 0 maps to x, so this bytecode loads obj.x.
What is the operand with the value 4 used for? It is an index into the so-called feedback vector of the function incrementX(). The feedback vector contains runtime information that is used for performance optimizations.
Now the registers look like this: a0 holds the object {x: 42}, r0 holds 1, and the accumulator holds 42, the value of obj.x.
Add r0, [6]
The last instruction adds the contents of r0 to the accumulator, giving 43. The operand 6 is another index into the feedback vector.
Return
Return returns the value in the accumulator. This ends the function incrementX(). The caller of incrementX() starts off with 43 in the accumulator and can continue working with that value.
Please note that the bytecode described here is that of V8 version 6.2, which is used in Chrome 62 and in the not yet released Node 9. We at Google are constantly working on V8 to improve performance and reduce memory consumption. In other versions of V8, the bytecode may differ from what is described here.
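The walkthrough of incrementX() above can be replayed with a toy accumulator machine. This is a minimal sketch for illustration, not how Ignition is implemented: the run function, the array encoding of the bytecodes and the handling of StackCheck as a no-op are all assumptions, and the feedback vector indexes are carried along but ignored.

```javascript
// Toy accumulator machine (hypothetical; Ignition itself works differently).
// bytecodes: array of [name, ...operands]; args: a0, a1, ...; constantPool: property names.
function run(bytecodes, args, constantPool) {
  const registers = {}; // r0, r1, ...
  let accumulator;      // implicit operand of most bytecodes

  for (const [name, ...operands] of bytecodes) {
    switch (name) {
      case 'StackCheck':
        break; // not covered in the walkthrough; treated as a no-op here
      case 'LdaSmi':
        accumulator = operands[0]; // load a small integer into the accumulator
        break;
      case 'Star':
        registers[operands[0]] = accumulator; // store the accumulator in a register
        break;
      case 'LdaNamedProperty': {
        const [argIndex, nameIndex] = operands; // third operand (feedback slot) ignored
        accumulator = args[argIndex][constantPool[nameIndex]];
        break;
      }
      case 'Add':
        accumulator = registers[operands[0]] + accumulator; // feedback slot ignored
        break;
      case 'Return':
        return accumulator; // the caller continues with this value
      default:
        throw new Error(`Unsupported bytecode: ${name}`);
    }
  }
  return undefined;
}

// The bytecode of incrementX() from the listing, in the toy encoding.
const incrementXBytecode = [
  ['StackCheck'],
  ['LdaSmi', 1],
  ['Star', 'r0'],
  ['LdaNamedProperty', 0, 0, 4], // a0, [0], [4]
  ['Add', 'r0', 6],              // r0, [6]
  ['Return'],
];

const result = run(incrementXBytecode, [{ x: 42 }], ['x']);
console.log(result); // prints 43
```

Passing a different object, for example run(incrementXBytecode, [{ x: 1 }], ['x']), follows the same steps with a different value loaded by LdaNamedProperty.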
Summary
At first glance, V8 bytecode may seem rather cryptic, especially when it is printed with all the extra information. But once you know that Ignition is a register machine with an accumulator register, you can work out what most bytecodes do.
Dear readers! Are you planning to analyze the bytecode of your JS programs?