Byte-machine for Forth (and not only) in Indian (Part 2)
Let's continue our experiments with bytecode. This article continues the one about a byte machine written in assembler; here is the first part.
Originally I planned to build a Forth interpreter in the second part and a Forth compiler for this byte machine in the third. But the article turned out far too long. An interpreter needs a bigger kernel (the set of byte commands) and implementations of variables, string parsing, string input, dictionaries, dictionary lookup... And at the very least, number output has to work. So I decided to split the interpreter article in two. In this article we will extend the kernel, settle how variables work, and output numbers. The rough plan from here: part 3 is the interpreter, part 4 is the compiler. And, of course, performance tests, which will land in part 4 or 5. Those articles will come after the New Year.
If scary assembler and bytecode have not frightened you off yet, read on below the cut! :)
First, let's fix the errors. The source file gets the .s extension, as is customary for GAS (thanks to mistergrim). Next, int 0x80 is replaced with syscall and 64-bit registers are used (thanks to qw1). At first I did not read the call description carefully and changed only the registers... and got a Segmentation fault. It turns out that everything is different for syscall, including the call numbers: sys_write is number 1 and sys_exit is number 60, and the arguments go in rdi, rsi and rdx. As a result, the bad, bye and type commands look like this:
b_bad = 0x00
bcmd_bad: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
mov rsi, offset msg_bad_byte # pointer to the string to print
mov rdx, msg_bad_byte_len # string length
syscall # call the kernel
mov rax, 60 # system call no. 60: sys_exit
mov rdi, 1 # exit with code 1 (syscall takes its first argument in rdi)
syscall # call the kernel
b_bye = 0x01
bcmd_bye: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
mov rsi, offset msg_bye # pointer to the string to print
mov rdx, msg_bye_len # string length
syscall # call the kernel
mov rax, 60 # system call no. 60: sys_exit
mov rdi, 0 # exit with code 0
syscall # call the kernel
b_type = 0x80
bcmd_type: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
pop rdx # string length
pop rsi # string address
push r8
syscall # call the kernel
pop r8
jmp _next
One more point. berez and fpauk rightly noted in the comments to the previous article that if a processor address appears in the bytecode, the bytecode becomes platform-dependent. In that example the address of the “Hello, world!” string was written into the bytecode as a value (with the lit64 command). Of course, this is not required; it was simply the easiest way to test the byte machine. I will not do this any more and will obtain the addresses of variables by other means, in particular with the var command (more on that later).
Warm up
And now, as a warm-up, let's implement all the basic integer arithmetic operations (+, -, *, /, mod, /mod, abs). We will need them.
The code is so simple that I give it under the spoiler without comment.
Arithmetic
b_add = 0x21
bcmd_add: pop rax
add [rsp], rax
jmp _next
b_sub = 0x22
bcmd_sub: pop rax
sub [rsp], rax
jmp _next
b_mul = 0x23
bcmd_mul: pop rax
pop rbx
imul rbx
push rax
jmp _next
b_div = 0x24
bcmd_div: pop rbx
pop rax
cqo
idiv rbx
push rax
jmp _next
b_mod = 0x25
bcmd_mod: pop rbx
pop rax
cqo
idiv rbx
push rdx
jmp _next
b_divmod = 0x26
bcmd_divmod: pop rbx
pop rax
cqo
idiv rbx
push rdx
push rax
jmp _next
b_abs = 0x27
bcmd_abs: mov rax, [rsp]
or rax, rax
jge _next
neg rax
mov [rsp], rax
jmp _next
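One detail of div, mod and /mod is worth keeping in mind: idiv truncates the quotient toward zero, and the remainder takes the sign of the dividend. The Python sketch below models that behaviour for illustration only (Python's own // rounds down instead). The names idiv_model and divmod_word are made up here, and 64-bit overflow is ignored.

```python
def idiv_model(dividend: int, divisor: int) -> tuple[int, int]:
    """Model x86 idiv: quotient truncated toward zero, remainder keeps the dividend's sign."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder


def divmod_word(stack: list[int]) -> None:
    """Model bcmd_divmod: ( a b -- rem quot ), the quotient ends up on top."""
    b = stack.pop()
    a = stack.pop()
    quotient, remainder = idiv_model(a, b)
    stack.append(remainder)
    stack.append(quotient)


stack = [-7, 2]
divmod_word(stack)
print(stack)  # [-1, -3]; Python's divmod(-7, 2) would give (-4, 1)
```

The last element of the Python list plays the role of the top of the data stack.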
Traditionally, Forth adds double-precision operations to the ordinary arithmetic and stack operations. The words for such operations usually start with the character “2”: 2DUP, 2SWAP and so on. But our standard arithmetic is already 64-bit, and we are definitely not going to do 128 bits today :)
Next we add the basic stack operations (drop, swap, rot, -rot, over, pick, roll).
Stack operations
b_drop = 0x31
bcmd_drop: add rsp, 8
jmp _next
b_swap = 0x32
bcmd_swap: pop rax
pop rbx
push rax
push rbx
jmp _next
b_rot = 0x33
bcmd_rot: pop rax
pop rbx
pop rcx
push rbx
push rax
push rcx
jmp _next
b_mrot = 0x34
bcmd_mrot: pop rcx
pop rbx
pop rax
push rcx
push rax
push rbx
jmp _next
b_over = 0x35
bcmd_over: push [rsp + 8]
jmp _next
b_pick = 0x36
bcmd_pick: pop rcx
push [rsp + 8*rcx]
jmp _next
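b_roll = 0x37
bcmd_roll: pop rcx
test rcx, rcx
jz _next # 0 roll leaves the stack unchanged
mov rbx, [rsp + 8*rcx] # remember the element being rolled out
roll1: mov rax, [rsp + 8*rcx - 8] # shift the elements above it one slot deeper
mov [rsp + 8*rcx], rax
dec rcx
jnz roll1
mov [rsp], rbx # the rolled element becomes the new top (the stack depth does not change)
jmp _next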
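For reference, here are the stack effects that rot, -rot, pick and roll are meant to have, written as a small Python model. This is a sketch, not the article's code: the stack is a Python list whose last element is the top, and the function names are chosen here for illustration.

```python
def rot(s: list[int]) -> None:
    """rot: ( a b c -- b c a )"""
    c, b, a = s.pop(), s.pop(), s.pop()
    s.extend([b, c, a])


def mrot(s: list[int]) -> None:
    """-rot: ( a b c -- c a b )"""
    c, b, a = s.pop(), s.pop(), s.pop()
    s.extend([c, a, b])


def pick(s: list[int]) -> None:
    """pick: ( xn ... x0 n -- xn ... x0 xn ); 0 pick behaves like dup"""
    n = s.pop()
    s.append(s[-1 - n])


def roll(s: list[int]) -> None:
    """roll: ( xn ... x0 n -- xn-1 ... x0 xn ); 1 roll behaves like swap"""
    n = s.pop()
    s.append(s.pop(-1 - n))


s = [1, 2, 3]
rot(s)
print(s)   # [2, 3, 1]
mrot(s)
print(s)   # [1, 2, 3]
```

Note that roll moves an element, so the depth after it is one less than before (the count n is consumed), while pick copies and leaves the depth unchanged.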
We will also add commands for reading and writing memory (the Forth words @ and !), along with their counterparts for other widths (8, 16 and 32 bits).
Reading and writing to memory
b_get = 0x40
bcmd_get: pop rcx
push [rcx]
jmp _next
b_set = 0x41
bcmd_set: pop rcx
pop rax
mov [rcx], rax
jmp _next
b_get8 = 0x42
bcmd_get8: pop rcx
movsx rax, byte ptr [rcx]
push rax
jmp _next
b_set8 = 0x43
bcmd_set8: pop rcx
pop rax
mov [rcx], al
jmp _next
b_get16 = 0x44
bcmd_get16: pop rcx
movsx rax, word ptr [rcx]
push rax
jmp _next
b_set16 = 0x45
bcmd_set16: pop rcx
pop rax
mov [rcx], ax
jmp _next
b_get32 = 0x46
bcmd_get32: pop rcx
movsx rax, dword ptr [rcx]
push rax
jmp _next
b_set32 = 0x47
bcmd_set32: pop rcx
pop rax
mov [rcx], eax
jmp _next
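The narrow reads above use movsx, so a value read with get8, get16 or get32 is sign-extended to 64 bits, while the narrow writes store only the low bytes. A minimal Python model of that behaviour, assuming little-endian memory held in a bytearray (the helper names are invented for this sketch):

```python
def get_signed(mem: bytearray, addr: int, size: int) -> int:
    """Model get8/get16/get32: read `size` bytes and sign-extend, as movsx does."""
    return int.from_bytes(mem[addr:addr + size], "little", signed=True)


def set_sized(mem: bytearray, addr: int, size: int, value: int) -> None:
    """Model set8/set16/set32: store only the low `size` bytes of the value."""
    mask = (1 << (8 * size)) - 1
    mem[addr:addr + size] = (value & mask).to_bytes(size, "little")


memory = bytearray(8)
set_sized(memory, 0, 1, 200)       # store the byte 0xC8
print(get_signed(memory, 0, 1))    # -56: 0xC8 comes back sign-extended
set_sized(memory, 0, 2, 0x12345)   # only the low 16 bits (0x2345) are stored
print(get_signed(memory, 0, 2))    # 9029
```

So a byte above 127 written with set8 reads back as a negative number through get8; code that wants an unsigned character has to mask the result.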
We may also need comparison commands, so let's add them too.
Comparison commands
# 0=
b_zeq = 0x50
bcmd_zeq: pop rax
or rax, rax
jnz rfalse
rtrue: push -1
jmp _next
rfalse: push 0
jmp _next
# 0<
b_zlt = 0x51
bcmd_zlt: pop rax
or rax, rax
jl rtrue
push 0
jmp _next
# 0>
b_zgt = 0x52
bcmd_zgt: pop rax
or rax, rax
jg rtrue
push 0
jmp _next
# =
b_eq = 0x53
bcmd_eq: pop rbx
pop rax
cmp rax, rbx
jz rtrue
push 0
jmp _next
# <
b_lt = 0x54
bcmd_lt: pop rbx
pop rax
cmp rax, rbx
jl rtrue
push 0
jmp _next
# >
b_gt = 0x55
bcmd_gt: pop rbx
pop rax
cmp rax, rbx
jg rtrue
push 0
jmp _next
# <=
b_lteq = 0x56
bcmd_lteq: pop rbx
pop rax
cmp rax, rbx
jle rtrue
push 0
jmp _next
# >=
b_gteq = 0x57
bcmd_gteq: pop rbx
pop rax
cmp rax, rbx
jge rtrue
push 0
jmp _next
We will not test these operations now. The main thing is that the assembler reports no errors. They will be debugged as we start using them.
Let's also make the word depth (stack depth) right away. To do this, at startup we save the initial values of the data stack and return stack pointers. These values may also come in handy when restarting the system.
init_stack: .quad 0
init_rstack: .quad 0
_start: mov rbp, rsp
sub rbp, stack_size
lea r8, start
mov init_stack, rsp
mov init_rstack, rbp
jmp _next
b_depth = 0x38
bcmd_depth: mov rax, init_stack
sub rax, rsp
shr rax, 3
push rax
jmp _next
Output numbers
Well, the warm-up is over; now we have to sweat a little. Let's teach our system to output numbers. In Forth, numbers are printed with the word "." (dot). We will do it the way standard Forth implementations do, using the words <#, hold, #, #s, #> and base. All of these words have to be implemented. A number is formatted using a buffer and a pointer to the character being formed; these will be the words holdbuf and holdpoint.
So, we need these words:
- holdbuf - a buffer in which the text representation of a number is built; it is filled from the end
- holdpoint - the address of the most recently stored character (in holdbuf)
- <# - starts the conversion; sets holdpoint to the byte just past the last byte of holdbuf
- hold - decrements holdpoint by 1 and stores the character from the stack into the buffer at the resulting address
- # - divides the number on top of the stack by the base, converts the remainder to a character and stores it in the buffer with hold
- #s - converts the whole number; in effect it calls the word # in a loop until 0 is left on the stack
- #> - finishes the conversion; pushes the address of the resulting string and its length onto the stack
We will write all of these words in bytecode, but first let's deal with variables.
Variables
And here comes a bit of Forth magic. In Forth a variable is a word. When this word is executed, it pushes onto the stack the address of the memory cell that holds the variable's value. You can read or write at that address. For example, to store 12345 in variable A you execute: “12345 A !”. Here 12345 is pushed onto the stack, then the variable A pushes its address, and the word "!" removes both values from the stack and writes 12345 to variable A. In typical Forth implementations (with direct threaded code), a variable is a processor CALL instruction with the address of _next, followed by the space reserved for the variable's value. When such a word is executed, the processor transfers control to _next and pushes the return address onto the stack (via RSP). But in Forth the processor stack is the data stack, and we never return anywhere. So execution simply continues, and the address of the variable is on the stack. And all of that with a single processor instruction! In assembler it would look like this:
call _next # jump to _next; the return address, where 12345 is stored, lands on the stack
.quad 12345
But we have bytecode and cannot use this mechanism! I did not figure out right away how to do the same thing in bytecode. But, thinking logically, nothing prevents us from implementing something very similar. We only have to remember that this will not be a processor instruction but bytecode, more precisely a bytecode “subroutine”. Here is the problem statement:
- it is a piece of bytecode that must return immediately when control is transferred to it
- after the return, the address where the variable's value is stored must be left on the data stack
We already have the exit byte command. If we make a bytecode word consisting of a single exit command, calling it returns immediately. What remains is a similar command that additionally pushes the address of the next byte (register R8) onto the stack. Let's make it an additional entry point to exit, which saves a jump:
b_var0 = 0x28
bcmd_var0: push r8
b_exit = 0x17
bcmd_exit: mov r8, [rbp]
add rbp, 8
_next: movzx rcx, byte ptr [r8]
inc r8
jmp [bcmd + rcx*8]
Now the base variable will look like this:
base: .byte b_var0
.quad 10
By the way, why var0 and not just var? Because there will be other commands for defining more advanced words that contain data. I will say more about them in the following articles.
Now we are ready to output numbers. Let's start!
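The interplay of call8, var0 and exit can be modelled in a few lines of Python. This is only a sketch under assumptions: memory is a flat bytearray, both stacks are Python lists, and the helper names (run, emit_call8) are invented here. Only the opcode values are taken from the article. The program stores 42 in a variable and reads it back, i.e. “42 A ! A @”.

```python
import struct

B_BYE, B_LIT8, B_CALL8, B_EXIT, B_VAR0, B_GET, B_SET = 0x01, 0x08, 0x0C, 0x17, 0x28, 0x40, 0x41


def run(mem: bytearray, ip: int) -> list[int]:
    """Interpret bytecode starting at ip; return the data stack when bye is reached."""
    data, rstack = [], []
    while True:
        op = mem[ip]
        ip += 1
        if op == B_BYE:
            return data
        elif op == B_LIT8:
            data.append(struct.unpack_from("<b", mem, ip)[0])
            ip += 1
        elif op == B_CALL8:
            offset = struct.unpack_from("<b", mem, ip)[0]
            ip += 1
            rstack.append(ip)        # return address: the byte after the operand
            ip += offset
        elif op in (B_VAR0, B_EXIT):
            if op == B_VAR0:
                data.append(ip)      # address of the data that follows var0
            ip = rstack.pop()        # var0 falls through into exit
        elif op == B_GET:
            addr = data.pop()
            data.append(struct.unpack_from("<q", mem, addr)[0])
        elif op == B_SET:
            addr = data.pop()
            value = data.pop()
            struct.pack_into("<q", mem, addr, value)
        else:
            raise ValueError(f"bad byte code {op:#04x}")


VAR_A = 0
mem = bytearray([B_VAR0]) + bytearray(8)   # A: .byte b_var0 / .quad 0
START = len(mem)


def emit_call8(target: int) -> None:
    operand_addr = len(mem) + 1            # "." in the expression target - . - 1
    mem.extend([B_CALL8, (target - operand_addr - 1) & 0xFF])


mem.extend([B_LIT8, 42])
emit_call8(VAR_A)
mem.append(B_SET)
emit_call8(VAR_A)
mem.extend([B_GET, B_BYE])

result = run(mem, START)
print(result)   # [42]
```

The address that var0 pushes is an offset inside this bytearray; in the real machine it is the value of R8, a processor address obtained at run time rather than stored in the bytecode.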
Words base, holdbuf, holdpoint
How variables are arranged is already decided, so the words base, holdbuf and holdpoint look like this:
base: .byte b_var0
.quad 10
holdbuf_len = 70
holdbuf: .byte b_var0
.space holdbuf_len
holdpoint: .byte b_var0
.quad 0
The size of the holdbuf buffer is 70 bytes. A number has at most 64 digits (when base 2 is selected). A few more characters are kept in reserve, for example for the sign and the space after the number. We will check for buffer overflow, but for now the extra characters are simply not stored in the buffer. A proper diagnostic can be added later.
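The sizing argument is easy to check with a few lines of Python. The helper digits_needed is a name made up for this sketch; it simply counts digits by repeated division, the same way the conversion itself works.

```python
def digits_needed(value: int, base: int) -> int:
    """Count the digits of a non-negative value written in the given base."""
    count = 1
    while value >= base:
        value //= base
        count += 1
    return count


HOLDBUF_LEN = 70
worst = digits_needed(2**64 - 1, 2)      # all 64 bits set, binary output
print(worst, worst + 2 <= HOLDBUF_LEN)   # 64 True: digits + sign + space fit
```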
hold
Now we can write the word hold. In Forth its code looks like this:
: hold holdpoint @ 1- dup holdbuf > if dup holdpoint ! c! else drop drop then ;
For those seeing Forth for the first time, I will go through the code in detail. I will not do this for the following words.
It starts with the word that defines new words, followed by the name of the new word: ": hold". Then comes the code, which ends with the word ";". Let's go through the body. For each command I give the state of the stack after the command has executed. Before the word is called, the stack holds the code of the character to be placed in the buffer (shown as <char>). Then it goes like this:
holdpoint <char> <address of the holdpoint variable>
@ <char> <contents of holdpoint>
1- <char> <contents of holdpoint minus 1>
dup <char> <contents of holdpoint minus 1> <contents of holdpoint minus 1>
holdbuf <char> <contents of holdpoint minus 1> <contents of holdpoint minus 1> <start of the holdbuf buffer>
> <char> <contents of holdpoint minus 1> <true if the contents of holdpoint minus 1 is greater than the start of holdbuf>
Next comes if, which is compiled into a conditional branch. The branch removes the comparison result from the stack. If the result is true, the new pointer is still inside the buffer and the commands between if and else run: the word "!" saves the new value to holdpoint (the address and the value are removed from the stack), and the word “c!” writes the character to the buffer; this is the set8 byte command (the address and the character are removed from the stack). If the result is false, the buffer is full and the commands between else and then run: two drop commands remove the character and the address. In the bytecode below the test is done with qbranch8, which jumps when the flag is true, so the drop drop sequence is laid out first and the store sequence sits at the jump target. The stack in the true branch:
dup <char> <contents of holdpoint minus 1> <contents of holdpoint minus 1>
holdpoint <char> <contents of holdpoint minus 1> <contents of holdpoint minus 1> <address of the holdpoint variable>
! <char> <contents of holdpoint minus 1>
c! that's it: the character is stored and the stack is empty! :)
That is how much this short sequence of commands does! Yes, Forth is laconic. Now let's switch on the manual “compiler” in our heads :) and compile it all into bytecode:
hold: .byte b_call8
.byte holdpoint - . - 1# holdpoint
.byte b_get # @
.byte b_wm # 1-
.byte b_dup # dup
.byte b_call8
.byte holdbuf - . - 1# holdbuf
.byte b_gt # >
.byte b_qbranch8 # if
.byte 0f - .
.byte b_drop # drop
.byte b_drop # drop
.byte b_branch8 # jump to the return (past then)
.byte 1f - .
0: .byte b_dup # dup
.byte b_call8
.byte holdpoint - . - 1 # holdpoint
.byte b_set # !
.byte b_set8 # c!
1: .byte b_exit # ;
Here I used local labels (0 and 1). Such labels are referred to by special names. For example, label 0 can be referenced as 0f or 0b, meaning the nearest label 0 forward or backward. This is quite convenient for labels that are only used locally, because you do not have to invent a different name each time.
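The operand expressions in the listing (target - . for branches and target - . - 1 for call8) follow from how the byte machine advances R8 before adding the offset. The Python sketch below spells out that arithmetic. The addresses in the example are invented; only the two formulas come from the listing.

```python
def branch8_operand(target: int, operand_addr: int) -> int:
    """`.byte target - .`: branch8 adds the offset while R8 still points at the operand."""
    return target - operand_addr


def call8_operand(target: int, operand_addr: int) -> int:
    """`.byte target - . - 1`: call8 steps past the operand first, then adds the offset."""
    return target - operand_addr - 1


def fits_int8(offset: int) -> bool:
    """An 8-bit operand is sign-extended, so it must lie in -128..127."""
    return -128 <= offset <= 127


# Hypothetical layout: a word at 0x100, a call8 operand byte at 0x10A.
offset = call8_operand(0x100, 0x10A)
print(offset, fits_int8(offset))   # -11 True
print(hex(0x10A + 1 + offset))     # 0x100: where execution continues
```

When fits_int8 would be false for a given distance, the 16-bit forms (call16 with target - . - 2) are the ones to use, as the later listings do.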
Word #
Let's write the word #. In Forth its code looks like this:
: # base @ /mod swap dup 10 < if c" 0 + else 10 - c" A + then hold ;
The condition checks whether the resulting digit is less than ten. If it is, the characters 0-9 are used; otherwise, letters starting from “A”. This makes it possible to work with hexadecimal, for example. The sequence c" 0 pushes the code of the character 0 onto the stack. Switch on the “compiler”:
conv: .byte b_call16
.word base - . - 2# base
.byte b_get # @
.byte b_divmod # /mod
.byte b_swap # swap
.byte b_dup # dup
.byte b_lit8
.byte 10 # 10
.byte b_lt # <
.byte b_qnbranch8 # if
.byte 0f - .
.byte b_lit8
.byte '0' # c" 0
.byte b_add # +
.byte b_branch8 # else
.byte 1f - .
0: .byte b_lit8
.byte 'A' - 10 # 10 - c" A (the subtraction is folded into the constant)
.byte b_add # +
1: .byte b_call16
.word hold - . - 2# hold
.byte b_exit # ;
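A single step of the word # can be written out in Python to check the digit mapping, in particular the “10 -” adjustment for digits above 9. The function names are made up for this sketch, and the value is assumed to be non-negative, as it is after abs.

```python
def digit_char(digit: int) -> str:
    """Map 0..35 to '0'..'9', 'A'..'Z', as the Forth source of # describes."""
    if digit < 10:
        return chr(digit + ord("0"))
    return chr(digit - 10 + ord("A"))


def convert_digit(value: int, base: int) -> tuple[int, str]:
    """One step of #: (quotient left on the stack, character passed to hold)."""
    quotient, remainder = divmod(value, base)
    return quotient, digit_char(remainder)


print(convert_digit(1234, 10))   # (123, '4')
print(convert_digit(255, 16))    # (15, 'F')
```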
Word <#
The word <# is quite simple:
: <# holdbuf 70 + holdpoint ! ;
Bytecode:
conv_start: .byte b_call16
.word holdbuf - . - 2
.byte b_lit8
.byte holdbuf_len
.byte b_add
.byte b_call16
.word holdpoint - . - 2
.byte b_set
.byte b_exit
Word #>
The word #> to complete the conversion looks like this:
: #> holdpoint @ holdbuf 70 + over - ;
Bytecode:
conv_end: .byte b_call16
.word holdpoint - . - 2
.byte b_get
.byte b_call16
.word holdbuf - . - 2
.byte b_lit8
.byte holdbuf_len
.byte b_add
.byte b_over
.byte b_sub
.byte b_exit
Word #s
And finally, the word #s:
: #s begin # dup 0= until ;
Bytecode:
conv_s: .byte b_call8
.byte conv - . - 1
.byte b_dup
.byte b_qbranch8
.byte conv_s - .
.byte b_exit
The attentive reader will notice a slight discrepancy here between the bytecode and the Forth code :)
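To see how <#, hold, #s and #> cooperate, here is a rough Python model of the buffer being filled from the end. It is a sketch with invented names (HoldBuffer, convert_all, finish); buffer indices stand in for the addresses kept in holdpoint.

```python
HOLDBUF_LEN = 70
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


class HoldBuffer:
    def __init__(self) -> None:
        self.buf = bytearray(HOLDBUF_LEN)
        self.point = HOLDBUF_LEN           # <# : index just past the last byte

    def hold(self, ch: str) -> None:
        new_point = self.point - 1
        if new_point > 0:                  # same test as hold: pointer > buffer start
            self.point = new_point
            self.buf[new_point] = ord(ch)
        # otherwise the character is silently dropped

    def convert_all(self, value: int, base: int) -> int:
        """#s: run the digit step at least once, stop when the quotient is 0."""
        while True:
            value, digit = divmod(value, base)
            self.hold(DIGITS[digit])
            if value == 0:
                return value

    def finish(self) -> str:
        """#> : the string runs from the pointer to the end of the buffer."""
        return self.buf[self.point:].decode("ascii")


h = HoldBuffer()
h.convert_all(1234, 10)
print(h.finish())   # 1234
```

Because the loop body runs before the test, the value 0 still produces the single digit "0" rather than an empty string.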
All is ready
Now nothing prevents us from writing the word ".", which outputs a number:
: . <# #s drop #> type ;
Bytecode:
dot: .byte b_call8
.byte conv_start - . - 1
.byte b_call8
.byte conv_s - . - 1
.byte b_drop
.byte b_call8
.byte conv_end - . - 1
.byte b_type
.byte b_exit
Let's write some test bytecode that checks our dot:
start: .byte b_lit16
.word 1234
.byte b_call16
.word dot - . - 2
.byte b_bye
Of course, it did not all work right away. But after debugging, the following result was obtained:
$ as forth.asm -o forth.o -g -ahlsm>list.txt
$ ld forth.o -o forth
$ ./forth
1234bye!
The flaw is visible immediately: after the number, Forth should print a space. So after the call to conv_start (<#) we add the commands 32 hold.
Let's also output the sign. At the beginning we add dup abs, and before #> we check the sign of the copy left on the stack and store a minus sign if the number is negative (0< if c" - hold then). As a result, the word "." takes the following form:
: . dup abs <# 32 hold #s drop 0< if c" - hold then #> type ;
Bytecode:
dot: .byte b_dup
.byte b_abs
.byte b_call8
.byte conv_start - . - 1
.byte b_lit8
.byte ' '
.byte b_call16
.word hold - . - 2
.byte b_call8
.byte conv_s - . - 1
.byte b_drop
.byte b_zlt
.byte b_qnbranch8
.byte 1f - .
.byte b_lit8
.byte '-'
.byte b_call16
.word hold - . - 2
1: .byte b_call8
.byte conv_end - . - 1
.byte b_type
.byte b_exit
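Since the buffer is filled from the end, the order of the hold calls in "." is the reverse of the printed order: the trailing space goes in first and the minus sign last. A compact Python model of that order, with a made-up function name dot and a plain list standing in for holdbuf:

```python
def dot(value: int, base: int = 10) -> str:
    """Model of the word ".": characters are produced right to left, as hold does."""
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    chars = [" "]                        # 32 hold: the trailing space is stored first
    magnitude = abs(value)               # dup abs
    while True:                          # #s
        magnitude, digit = divmod(magnitude, base)
        chars.append(digits[digit])
        if magnitude == 0:
            break
    if value < 0:                        # 0< if c" - hold then
        chars.append("-")
    return "".join(reversed(chars))      # #> type


print(repr(dot(-1234)))   # '-1234 '
```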
In the starting sequence of byte commands, we put a negative number and check:
$ as forth.asm -o forth.o -g -ahlsm>list.txt
$ ld forth.o -o forth
$ ./forth
-1234 bye!
Number output works!
Full source
.intel_syntax noprefix
stack_size = 1024
.section .data
init_stack: .quad 0
init_rstack: .quad 0
msg_bad_byte:
.ascii "Bad byte code!\n"
msg_bad_byte_len = . - msg_bad_byte # the symbol is assigned the length of the string
msg_bye:
.ascii "bye!\n"
msg_bye_len = . - msg_bye
bcmd:
.quad bcmd_bad, bcmd_bye, bcmd_num0, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # 0x00
.quad bcmd_lit8, bcmd_lit16, bcmd_lit32, bcmd_lit64, bcmd_call8, bcmd_call16, bcmd_call32, bcmd_bad
.quad bcmd_branch8, bcmd_branch16, bcmd_qbranch8, bcmd_qbranch16, bcmd_qnbranch8, bcmd_qnbranch16,bcmd_bad, bcmd_exit # 0x10
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_wm, bcmd_add, bcmd_sub, bcmd_mul, bcmd_div, bcmd_mod, bcmd_divmod, bcmd_abs # 0x20
.quad bcmd_var0, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_dup, bcmd_drop, bcmd_swap, bcmd_rot, bcmd_mrot, bcmd_over, bcmd_pick, bcmd_roll # 0x30
.quad bcmd_depth, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_get, bcmd_set, bcmd_get8, bcmd_set8, bcmd_get16, bcmd_set16, bcmd_get32, bcmd_set32 # 0x40
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_zeq, bcmd_zlt, bcmd_zgt, bcmd_eq, bcmd_lt, bcmd_gt, bcmd_lteq, bcmd_gteq #0x50
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # 0x60
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_type, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad # 0x80
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
.quad bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad, bcmd_bad
start: .byte b_lit16
.word -1234
.byte b_call16
.word dot - . - 2
.byte b_bye
base: .byte b_var0
.quad 10
holdbuf_len = 70
holdbuf: .byte b_var0
.space holdbuf_len
holdpoint: .byte b_var0
.quad 0
# : hold holdpoint @ 1- dup holdbuf > if dup holdpoint ! c! else drop drop then ;
hold: .byte b_call8
.byte holdpoint - . - 1# holdpoint
.byte b_get # @
.byte b_wm # 1-
.byte b_dup # dup
.byte b_call8
.byte holdbuf - . - 1# holdbuf
.byte b_gt # >
.byte b_qbranch8 # if
.byte 0f - .
.byte b_drop # drop
.byte b_drop # drop
.byte b_branch8 # jump to the return (past then)
.byte 1f - .
0: .byte b_dup # dup
.byte b_call8
.byte holdpoint - . - 1 # holdpoint
.byte b_set # !
.byte b_set8 # c!
1: .byte b_exit # ;
# : # base @ /mod swap dup 10 < if c" 0 + else 10 - c" A + then hold ;
conv: .byte b_call16
.word base - . - 2# base
.byte b_get # @
.byte b_divmod # /mod
.byte b_swap # swap
.byte b_dup # dup
.byte b_lit8
.byte 10 # 10
.byte b_lt # <
.byte b_qnbranch8 # if
.byte 0f - .
.byte b_lit8
.byte '0'# c" 0
.byte b_add # +
.byte b_branch8 # else
.byte 1f - .
0: .byte b_lit8
.byte 'A' - 10 # 10 - c" A (the subtraction is folded into the constant)
.byte b_add # +
1: .byte b_call16
.word hold - . - 2# hold
.byte b_exit # ;
# : <# holdbuf 70 + holdpoint ! ;
conv_start: .byte b_call16
.word holdbuf - . - 2
.byte b_lit8
.byte holdbuf_len
.byte b_add
.byte b_call16
.word holdpoint - . - 2
.byte b_set
.byte b_exit
# : #s begin # dup 0= until ;
conv_s: .byte b_call8
.byte conv - . - 1
.byte b_dup
.byte b_qbranch8
.byte conv_s - .
.byte b_exit
# : #> holdpoint @ holdbuf 70 + over - ;
conv_end: .byte b_call16
.word holdpoint - . - 2
.byte b_get
.byte b_call16
.word holdbuf - . - 2
.byte b_lit8
.byte holdbuf_len
.byte b_add
.byte b_over
.byte b_sub
.byte b_exit
dot: .byte b_dup
.byte b_abs
.byte b_call8
.byte conv_start - . - 1
.byte b_lit8
.byte ' '
.byte b_call16
.word hold - . - 2
.byte b_call8
.byte conv_s - . - 1
.byte b_drop
.byte b_zlt
.byte b_qnbranch8
.byte 1f - .
.byte b_lit8
.byte '-'
.byte b_call16
.word hold - . - 2
1: .byte b_call8
.byte conv_end - . - 1
.byte b_type
.byte b_exit
.section .text
.global _start # program entry point
_start: mov rbp, rsp
sub rbp, stack_size
lea r8, start
mov init_stack, rsp
mov init_rstack, rbp
jmp _next
b_var0 = 0x28
bcmd_var0: push r8
b_exit = 0x17
bcmd_exit: mov r8, [rbp]
add rbp, 8
_next: movzx rcx, byte ptr [r8]
inc r8
jmp [bcmd + rcx*8]
b_num0 = 0x02
bcmd_num0: push 0
jmp _next
b_lit8 = 0x08
bcmd_lit8: movsx rax, byte ptr [r8]
inc r8
push rax
jmp _next
b_lit16 = 0x09
bcmd_lit16: movsx rax, word ptr [r8]
add r8, 2
push rax
jmp _next
b_call8 = 0x0C
bcmd_call8: movsx rax, byte ptr [r8]
sub rbp, 8
inc r8
mov [rbp], r8
add r8, rax
jmp _next
b_call16 = 0x0D
bcmd_call16: movsx rax, word ptr [r8]
sub rbp, 8
add r8, 2
mov [rbp], r8
add r8, rax
jmp _next
b_call32 = 0x0E
bcmd_call32: movsx rax, dword ptr [r8]
sub rbp, 8
add r8, 4
mov [rbp], r8
add r8, rax
jmp _next
b_lit32 = 0x0A
bcmd_lit32: movsx rax, dword ptr [r8]
add r8, 4
push rax
jmp _next
b_lit64 = 0x0B
bcmd_lit64: mov rax, [r8]
add r8, 8
push rax
jmp _next
b_dup = 0x30
bcmd_dup: push [rsp]
jmp _next
b_wm = 0x20
bcmd_wm: decq [rsp]
jmp _next
b_add = 0x21
bcmd_add: pop rax
add [rsp], rax
jmp _next
b_sub = 0x22
bcmd_sub: pop rax
sub [rsp], rax
jmp _next
b_mul = 0x23
bcmd_mul: pop rax
pop rbx
imul rbx
push rax
jmp _next
b_div = 0x24
bcmd_div: pop rbx
pop rax
cqo
idiv rbx
push rax
jmp _next
b_mod = 0x25
bcmd_mod: pop rbx
pop rax
cqo
idiv rbx
push rdx
jmp _next
b_divmod = 0x26
bcmd_divmod: pop rbx
pop rax
cqo
idiv rbx
push rdx
push rax
jmp _next
b_abs = 0x27
bcmd_abs: mov rax, [rsp]
or rax, rax
jge _next
neg rax
mov [rsp], rax
jmp _next
b_drop = 0x31
bcmd_drop: add rsp, 8
jmp _next
b_swap = 0x32
bcmd_swap: pop rax
pop rbx
push rax
push rbx
jmp _next
b_rot = 0x33
bcmd_rot: pop rax
pop rbx
pop rcx
push rbx
push rax
push rcx
jmp _next
b_mrot = 0x34
bcmd_mrot: pop rcx
pop rbx
pop rax
push rcx
push rax
push rbx
jmp _next
b_over = 0x35
bcmd_over: push [rsp + 8]
jmp _next
b_pick = 0x36
bcmd_pick: pop rcx
push [rsp + 8*rcx]
jmp _next
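b_roll = 0x37
bcmd_roll: pop rcx
test rcx, rcx
jz _next # 0 roll leaves the stack unchanged
mov rbx, [rsp + 8*rcx] # remember the element being rolled out
roll1: mov rax, [rsp + 8*rcx - 8] # shift the elements above it one slot deeper
mov [rsp + 8*rcx], rax
dec rcx
jnz roll1
mov [rsp], rbx # the rolled element becomes the new top (the stack depth does not change)
jmp _next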
b_depth = 0x38
bcmd_depth: mov rax, init_stack
sub rax, rsp
shr rax, 3
push rax
jmp _next
b_get = 0x40
bcmd_get: pop rcx
push [rcx]
jmp _next
b_set = 0x41
bcmd_set: pop rcx
pop rax
mov [rcx], rax
jmp _next
b_get8 = 0x42
bcmd_get8: pop rcx
movsx rax, byte ptr [rcx]
push rax
jmp _next
b_set8 = 0x43
bcmd_set8: pop rcx
pop rax
mov [rcx], al
jmp _next
b_get16 = 0x44
bcmd_get16: pop rcx
movsx rax, word ptr [rcx]
push rax
jmp _next
b_set16 = 0x45
bcmd_set16: pop rcx
pop rax
mov [rcx], ax
jmp _next
b_get32 = 0x46
bcmd_get32: pop rcx
movsx rax, dword ptr [rcx]
push rax
jmp _next
b_set32 = 0x47
bcmd_set32: pop rcx
pop rax
mov [rcx], eax
jmp _next
# 0=
b_zeq = 0x50
bcmd_zeq: pop rax
or rax, rax
jnz rfalse
rtrue: push -1
jmp _next
rfalse: push 0
jmp _next
# 0<
b_zlt = 0x51
bcmd_zlt: pop rax
or rax, rax
jl rtrue
push 0
jmp _next
# 0>
b_zgt = 0x52
bcmd_zgt: pop rax
or rax, rax
jg rtrue
push 0
jmp _next
# =
b_eq = 0x53
bcmd_eq: pop rbx
pop rax
cmp rax, rbx
jz rtrue
push 0
jmp _next
# <
b_lt = 0x54
bcmd_lt: pop rbx
pop rax
cmp rax, rbx
jl rtrue
push 0
jmp _next
# >
b_gt = 0x55
bcmd_gt: pop rbx
pop rax
cmp rax, rbx
jg rtrue
push 0
jmp _next
# <=
b_lteq = 0x56
bcmd_lteq: pop rbx
pop rax
cmp rax, rbx
jle rtrue
push 0
jmp _next
# >=
b_gteq = 0x57
bcmd_gteq: pop rbx
pop rax
cmp rax, rbx
jge rtrue
push 0
jmp _next
b_branch8 = 0x10
bcmd_branch8: movsx rax, byte ptr [r8]
add r8, rax
jmp _next
b_branch16 = 0x11
bcmd_branch16: movsx rax, word ptr [r8]
add r8, rax
jmp _next
b_qbranch8 = 0x12
bcmd_qbranch8: pop rax
or rax, rax
jnz bcmd_branch8
inc r8
jmp _next
b_qbranch16 = 0x13
bcmd_qbranch16: pop rax
or rax, rax
jnz bcmd_branch16
add r8, 2
jmp _next
b_qnbranch8 = 0x14
bcmd_qnbranch8: pop rax
or rax, rax
jz bcmd_branch8
inc r8
jmp _next
b_qnbranch16 = 0x15
bcmd_qnbranch16:pop rax
or rax, rax
jz bcmd_branch16
add r8, 2
jmp _next
b_bad = 0x00
bcmd_bad: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
mov rsi, offset msg_bad_byte # pointer to the string to print
mov rdx, msg_bad_byte_len # string length
syscall # call the kernel
mov rax, 60 # system call no. 60: sys_exit
mov rdi, 1 # exit with code 1 (syscall takes its first argument in rdi)
syscall # call the kernel
b_bye = 0x01
bcmd_bye: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
mov rsi, offset msg_bye # pointer to the string to print
mov rdx, msg_bye_len # string length
syscall # call the kernel
mov rax, 60 # system call no. 60: sys_exit
mov rdi, 0 # exit with code 0
syscall # call the kernel
b_type = 0x80
bcmd_type: mov rax, 1 # system call no. 1: sys_write
mov rdi, 1 # file descriptor 1: stdout
pop rdx # string length
pop rsi # string address
push r8
syscall # call the kernel
pop r8
jmp _next
Summary
Now we have a pretty decent kernel of byte commands: all the basic arithmetic, stack operations, comparisons, memory access and variables. Number output is there as well, implemented entirely in bytecode. Everything is ready for the interpreter, which we will build in the next article!
Happy New Year, everyone!
Criticism is welcome! :)
Continued: Byte-machine for Forth (and not only) in Indian (part 3)