Operating systems from scratch; level 3 (first half)


In this lab we will implement the ability to run user programs, that is, processes and all the infrastructure they depend on. First we will figure out how to switch out of privileged code and how to switch process contexts. Then we will implement a simple round-robin scheduler, system calls and virtual memory management. At the end, we will move our shell from kernel space to user space.



Phase 0: Getting Started


As in the previous parts, to be sure that everything works you will need:


  • A machine with modern Unix: Linux, BSD or macOS.
  • 64-bit OS.
  • The presence of a USB port.
  • Installed software from previous releases.

Code retrieval


The 3-spawn repository contains nothing but the questions, but nothing stops you from cloning it:


git clone https://web.stanford.edu/class/cs140e/assignments/3-spawn/skeleton.git 3-spawn

After that, the directory structure should look something like this:


cs140e
├── 0-blinky
├── 1-shell
├── 2-fs
├── 3-spawn
└── os

Inside the os repository, however, you will still need to switch to the 3-spawn branch:


cd os
git fetch
git checkout 3-spawn
git merge 2-fs

Most likely you will again see merge conflicts. Something like this:


Auto-merging kernel/src/kmain.rs
CONFLICT (content): Merge conflict in kernel/src/kmain.rs
Automatic merge failed; fix conflicts and then commit the result.

Merge conflicts need to be resolved manually by editing the file kmain.rs. Make sure that you keep all of your changes from Lab 2. After resolving the conflicts, add the files with git add and commit everything. For more information on this topic, see the tutorial on githowto.com.


ARM Documentation


In this assignment, we will constantly refer to the three official ARM documents. These three are:


  1. ARMv8 Reference Manual
    This is the official ARMv8 architecture reference manual, a one-stop guide that covers the entire architecture. For the specific implementation of this architecture in the Raspberry Pi's processor, we need manual No. 2. We will refer to sections of this large ARMv8 manual with notes of the form (ref: C5.2). In this case, this means you need to look at section C5.2 of the ARMv8 Reference Manual.
  2. ARM Cortex-A53 Manual
    This is the manual for the specific implementation of ARMv8 (v8.0-A) that is used in the Raspberry Pi. We will refer to this manual with notes of the form (A53: 4.3.30).
  3. ARMv8-A Programmer Guide
    This is a fairly high-level guide to programming ARMv8-A. We will refer to it with notes of the form (guide: 10.1).

I highly recommend downloading these manuals to your disk so that they are easier to open each time. Especially the first one, because it is very, very large. Speaking of which.


How do you read it at all? We do not need to read it in its entirety, so to begin with it is extremely important to know what we want to find in it. The manual is well structured and divided into several parts. We are interested in AArch64 and do not want to dive too deep (we are not processor manufacturers), so many chapters do not concern us at all. In fact, parts A and B and some information from C and D are enough for us. The first two parts describe general concepts of the architecture and of AArch64 in particular. Part C describes the instruction set. We will use this part as a reference for the most basic instructions and registers (SIMD, for example, does not interest us for now). Part D describes some of the details of AArch64, in particular exceptions and interrupts.


Phase 1: ARM and a Leg (Arm and Leg)


In this phase, we will study the ARMv8 architecture, switch to a less privileged level, set up the CPU's exception vectors, and handle timer interrupts and breakpoint exceptions. We will look at the exception levels of the ARM architecture, and mainly at how to catch those exceptions and interrupts.


Subphase A: ARMv8 Review


In this subphase, we will study the architecture of ARMv8. Here we will not write any code, but there are questions for self-testing.


ARM (Acorn RISC Machine) is a microprocessor architecture with more than 30 years of history. There are currently eight versions of this architecture. The latest, ARMv8, was introduced in 2011. Broadcom's BCM2837 chip contains ARM Cortex-A53 cores, which are ARMv8.0-based. Cortex-A53 (and the like) is an implementation of the architecture, and it is the implementation we will study throughout this part.


ARM microprocessors dominate the mobile market.

ARM accounts for about 95% of the global smartphone market and 100% of flagship smartphones, including the Apple iPhone and the Google Pixel.

So far, we have tried to avoid dealing with the processor architecture; Rust did everything for us. To run processes in user space, we will need to do a certain amount of low-level work. Programming the processor directly requires familiarity with the assembly language of this architecture and with the concepts around it. We will start with an overview of the architecture and go through the most basic assembly instructions.


Registers


The ARMv8 architecture has the following registers (ref: D1.2.1):


  • r0...r30 - 64-bit general purpose registers. These registers are accessed through aliases. The registers x0...x30 are aliases for the full 64-bit version. There are also the aliases w0...w30, which access the lower 32 bits of the register.
  • lr - 64-bit link register. An alias for x30. Used to store the return address. The instruction bl <addr> saves the address of the next instruction in lr and jumps to addr. The ret instruction does the reverse: it takes the address from lr and assigns it to the PC.
  • sp - stack pointer. The lower 32 bits are available through the alias wsp. The stack pointer should always be aligned to 16 bytes.
  • pc - program counter. This register cannot be written directly, but it can be read. It is updated by branch instructions, when an exception is taken, and on return.
  • v0...v31 - 128-bit SIMD and FP registers. They are used for vector SIMD operations and for floating point operations. These registers are also accessed through aliases: q0...q31 are aliases for all 128 bits of the register, and d0...d31 are the lower 64 bits. In addition, there are aliases for the lower 32, 16 and 8 bits with the prefixes s, h and b respectively.
  • xzr - zero register. This is a pseudo-register, which may or may not be a hardware register. It always contains 0 and can only be read.

There are many more special purpose registers. We will talk about them a little later.


PSTATE


At any point in time, an ARMv8 CPU gives access to the state of the program through a pseudo-register named PSTATE (ref: D1.7). This is not an ordinary register: it cannot be read or written directly. Instead, there are several special-purpose registers that can be used to operate on parts of the PSTATE pseudo-register. On ARMv8.0, these are:


  • NZCV - status flags
  • DAIF - a bit mask of exceptions, which is used to enable and disable these very exceptions
  • CurrentEL - current level of exceptions (to be described later)
  • SPSel - stack pointer selector (there are actually several)

Such registers belong to the class of system or special registers (ref: C5.2). Ordinary registers can be loaded from memory with ldr or stored to memory with str. System registers cannot be used like that. Instead, the special instructions mrs and msr are required (ref: C6.2.162 - C6.2.164). For example, to read NZCV into x1, you would write:


mrs x1, NZCV
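
Once the flags are in a general purpose register, they are just bits. As a minimal sketch that runs on an ordinary host (not on the Pi), the following decodes such a value, assuming the architectural layout in which N, Z, C and V occupy bits 31 to 28. The names Nzcv and decode_nzcv are made up for this illustration and are not part of the lab skeleton.

```rust
/// Condition flags as laid out in the NZCV system register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Nzcv {
    pub n: bool, // bit 31: result was negative
    pub z: bool, // bit 30: result was zero
    pub c: bool, // bit 29: carry (no borrow for subtraction)
    pub v: bool, // bit 28: signed overflow
}

/// Decodes the raw value that `mrs x1, NZCV` would leave in `x1`.
/// Hypothetical helper for illustration only.
pub fn decode_nzcv(raw: u64) -> Nzcv {
    Nzcv {
        n: (raw & (1 << 31)) != 0,
        z: (raw & (1 << 30)) != 0,
        c: (raw & (1 << 29)) != 0,
        v: (raw & (1 << 28)) != 0,
    }
}
```

For example, decode_nzcv(0x6000_0000) reports Z and C set, which is the pattern left behind by comparing two equal values.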

Execution state


At any given time, an ARMv8 CPU is executing in a particular execution state. There are exactly two such states: AArch32, a compatibility mode with 32-bit ARMv7, and AArch64, the 64-bit ARMv8 mode (guide: 3.1). We will work only with AArch64.


Security state


At any given time, the CPU is also executing in a particular security state (guide: 3). You can also find this under the names security mode or security world. There are only two states: secure and non-secure, also called normal. We will work entirely in the normal state.


Exception levels


In addition to this, there are also exception levels (guide: 3). Each exception level corresponds to a specific privilege level. The higher the exception level, the more privileges a program running at that level has. There are 4 levels in total:


  • EL0 (user) - Usually used to run user programs.
  • EL1 (kernel) - Privileged mode. Usually, the kernel of the operating system is launched here.
  • EL2 (hypervisor) - Typically used to run virtual machine hypervisors.
  • EL3 (monitor) - Commonly used for low-level firmware.

The Raspberry Pi processor boots into EL3. At this point, the firmware provided by the Raspberry Pi Foundation is launched. The firmware switches the processor to EL2 and launches our file kernel8.img. Thus, our kernel starts from the EL2 level. A little later, we will switch from EL2 to EL1, so that our kernel works at the appropriate level of exceptions.


ELx registers


A number of system registers, such as ELR, SPSR and SP, are duplicated for each exception level. Their names get the suffix _ELn, where n is the exception level the register belongs to. For example, ELR_EL1 is the exception link register for EL1, while ELR_EL2 is the same register for EL2.


We will use the suffix x (for example, in ELR_ELx) when we need to refer to a register of the target exception level x. The target exception level is the exception level the CPU will switch to (if necessary) in order to run the exception vector.


We will use the suffix s (for example, in SP_ELs) when we need to refer to a register of the source exception level s. The source exception level is the exception level at which the CPU was executing when the exception occurred.


Switch between exception levels


There is exactly one mechanism for raising the exception level and exactly one mechanism for lowering it.


To switch from a higher level to a lower one (dropping privileges), the running program must return from its current exception level with the eret instruction (ref: D1.11). When eret is executed at level ELx, the CPU will:


  • Set PC to the value from the special register ELR_ELx.
  • Set PSTATE to a value from special register SPSR_ELx.

The register SPSR_ELx( ref : C5.2.18), among other things, contains the level of exceptions to which you must go. In addition, it is worth paying attention to the following additional consequences of changing exception levels:


  • When returning to ELs, sp is set to SP_ELs if SPSR_ELx[0] == 1, or to SP_EL0 if SPSR_ELx[0] == 0.
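
To make the role of SPSR_ELx more concrete, here is a hedged host-side sketch that picks apart the fields eret consults. It assumes the architectural layout for a return to AArch64: bit 0 selects the stack pointer, bits [3:2] hold the exception level to return to, and bits [9:6] hold the D, A, I, F mask bits. The names ReturnTarget and decode_spsr are invented for this example.

```rust
/// What `eret` would restore from an SPSR_ELx value (AArch64 target only).
#[derive(Debug, PartialEq, Eq)]
pub struct ReturnTarget {
    pub el: u8,           // bits [3:2]: exception level to return to
    pub use_sp_els: bool, // bit 0: true => SP_ELs, false => SP_EL0
    pub daif: u8,         // bits [9:6]: D, A, I, F mask bits
}

/// Hypothetical decoder for illustration; not part of the lab skeleton.
pub fn decode_spsr(spsr: u64) -> ReturnTarget {
    ReturnTarget {
        el: ((spsr >> 2) & 0b11) as u8,
        use_sp_els: (spsr & 1) == 1,
        daif: ((spsr >> 6) & 0b1111) as u8,
    }
}
```

Feeding a candidate value through decode_spsr before writing it with msr is a cheap way to check that it names the level, stack pointer and masks you intended.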

The transition from a lower level to a higher one happens only as the result of an exception (guide: 10). Unless configured otherwise, the CPU takes exceptions to the next level up. For example, if an interrupt arrives while running at EL0, the CPU will switch to EL1 to handle it. When switching to ELx, the CPU will do the following:


  • Mask all exceptions and interrupts: PSTATE.DAIF = 0b1111.
  • Save PSTATE and other state to SPSR_ELx.
  • Save the return address to ELR_ELx (ref: D1.10.1).
  • Set sp to SP_ELx if SPSel equals 1.
  • Write the exception syndrome (described later) to ESR_ELx (ref: D1.10.4).
  • Set pc to the address of the corresponding exception vector (described a bit later).

Note that the exception syndrome register is only valid for synchronous exceptions. All general purpose registers and SIMD / FP registers will contain the values ​​that they had when an exception occurred.


Exception Vectors


When an exception occurs, the CPU transfers control to the location of the corresponding exception vector (ref: D1.10.2). There are 4 types of exceptions, each of which has 4 possible sources, for a total of 16 exception vectors. Here are the four types of exceptions:


  • Synchronous - exceptions caused by instructions such as svc or brk. In general, any event the programmer is responsible for.
  • IRQ - asynchronous interrupts from external sources.
  • FIQ - asynchronous interrupts from external sources. A variant for fast handling.
  • SError - "system error" interrupts.

Here are the four sources of exceptions:


  • Current exception level for SP = SP_EL0
  • Current exception level for SP = SP_ELx
  • Lower exception level at which AArch64 is executed
  • Lower exception level at which AArch32 runs

From the manual's description (guide: 10.4):


When an exception occurs, the processor must execute handler code that corresponds to the exception. The location in memory where the [exception] handler is stored is called the exception vector. In the ARM architecture, exception vectors are stored in a table called the exception vector table. Each exception level has its own vector table, that is, there is one for each of EL3, EL2 and EL1. The table contains instructions to be executed, rather than a set of addresses [as in x86]. Each entry in the vector table is 16 instructions long. Vectors for individual exceptions are located at fixed offsets from the beginning of the table. The virtual address of each table base is set by the [special] vector base address registers VBAR_EL3, VBAR_EL2 and VBAR_EL1.

These vectors are physically located in memory as follows:


Current exception level for SP = SP_EL0


Offset from VBAR_ELx    Exception
0x000                   Synchronous exception
0x080                   IRQ
0x100                   FIQ
0x180                   SError

Current exception level for SP = SP_ELx


Offset from VBAR_ELx    Exception
0x200                   Synchronous exception
0x280                   IRQ
0x300                   FIQ
0x380                   SError

Lower exception level at which AArch64 is executed


Offset from VBAR_ELx    Exception
0x400                   Synchronous exception
0x480                   IRQ
0x500                   FIQ
0x580                   SError

Lower exception level at which AArch32 runs


Offset from VBAR_ELx    Exception
0x600                   Synchronous exception
0x680                   IRQ
0x700                   FIQ
0x780                   SError
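
The four tables follow a regular pattern: each source occupies a 0x200-byte block, and within a block the vectors are 0x80 bytes apart. A small host-side sketch of that arithmetic is shown below. The enum names and their numeric values are assumptions chosen to match the order of the tables; they are not taken from the lab skeleton.

```rust
/// Where the exception came from, in the order of the tables above.
#[derive(Debug, Clone, Copy)]
pub enum Source {
    CurrentSpEl0 = 0,
    CurrentSpElx = 1,
    LowerAArch64 = 2,
    LowerAArch32 = 3,
}

/// The type of exception, in the order of the table rows.
#[derive(Debug, Clone, Copy)]
pub enum Kind {
    Synchronous = 0,
    Irq = 1,
    Fiq = 2,
    SError = 3,
}

/// Byte offset of a vector from VBAR_ELx (illustrative helper).
pub fn vector_offset(source: Source, kind: Kind) -> u64 {
    (source as u64) * 0x200 + (kind as u64) * 0x80
}
```

The address the CPU actually jumps to is the value of VBAR_ELx plus this offset.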

Summary


For now, this is all we need to know about the ARMv8 architecture. Before continuing, try to answer these questions. For self-testing.


What are the aliases of the register x30? [arm-x30]

If we write 0xFFFF to register x30, what two other names for this register can we use to read that value back?

How can I change the PC value to a specific address? [arm-pc]

How can you set the PC to address A using the ret instruction? How can you set the PC to address A using the eret instruction? Indicate which registers you would modify to achieve this.

How can I determine the current level of exceptions? [arm-el]

Which specific instructions would you execute to determine the current exception level?

How would you change the stack pointer on return from an exception? [arm-sp-el]

The stack pointer of the running program is A at the time of the exception. After handling the exception, you want to return to where the program was executing, but with the stack pointer changed to B. How do you do that?

Which vector is used for system calls from a lower EL? [arm-svc]

A user process is running at EL0. This process executes svc. To what address will control be transferred?

Which vector is used for interrupts from a lower EL? [arm-int]

A user process is running at EL0. At this point, a timer interrupt occurs. To what address will control be transferred?

How can I enable IRQ exception handling? [arm-mask]

Which values would you write to which register in order to unmask IRQ interrupts?

How would you use eret to switch to AArch32? [arm-aarch32]

The exception source is AArch64. The handler for this exception also runs in AArch64. Which values in which registers would you change so that, on return from the exception via eret, the CPU switches to the AArch32 execution state?
Hint: see (guide: 10.1).

Subphase B: Assembler Instructions



In this subphase, we will learn the most basic commands from the ARMv8 command set. We will not write the code right now, but there are a couple of questions for self-testing.


Memory access


ARMv8 is a load/store RISC (reduced instruction set computer) instruction set. The defining feature of such an instruction set is that memory can only be accessed through dedicated instructions. In particular, memory can only be read by loading it into a register with a load instruction, and written only by storing a register with a store instruction.


There are many load and store instructions in several variations (most of them follow the same pattern). Let's start with the simplest form:


  • ldr <ra>, [<rb>]: loads the value at the address in <rb> into <ra>.
  • str <ra>, [<rb>]: stores the value of <ra> at the address in <rb>.

The register <rb> is called the base register. For example, if r3 = 0x1234, then:


ldr r0, [r3]      // r0 = *r3 (that is, r0 = *(0x1234))
str r0, [r3]      // *r3 = r0 (that is, *(0x1234) = r0)

In addition, you can add an offset in the range [-256, 255]:


ldr r0, [r3, #64]      // r0 = *(r3 + 64)
str r0, [r3, #-12]     // *(r3 - 12) = r0

You can also specify a post-index, which changes the value in the base register after the load or store has been performed:


ldr r0, [r3], #30      // r0 = *r3; r3 += 30
str r0, [r3], #-12     // *r3 = r0; r3 -= 12

Or a pre-index that changes the value in the base register before applying load or save:


ldr r0, [r3, #30]!     // r3 += 30; r0 = *r3
str r0, [r3, #-12]!    // r3 -= 12; *r3 = r0

Offset, post-index and pre-index are known as addressing modes.
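
The difference between the three modes is only in when (and whether) the base register is updated. The following toy model, which runs on a host and is not real hardware access, makes that order explicit. The Machine type, its word-indexed memory and the method names are all assumptions made for this sketch.

```rust
/// A toy machine: one base register and a memory indexed directly by address.
pub struct Machine {
    pub base: i64,     // plays the role of r3 in the examples above
    pub mem: Vec<u64>, // toy memory: mem[addr] is the word at `addr`
}

impl Machine {
    fn load(&self, addr: i64) -> u64 {
        self.mem[addr as usize]
    }

    /// ldr r0, [base, #off]: base is left unchanged.
    pub fn ldr_offset(&self, off: i64) -> u64 {
        self.load(self.base + off)
    }

    /// ldr r0, [base], #off: load first, then update base.
    pub fn ldr_post(&mut self, off: i64) -> u64 {
        let value = self.load(self.base);
        self.base += off;
        value
    }

    /// ldr r0, [base, #off]!: update base first, then load.
    pub fn ldr_pre(&mut self, off: i64) -> u64 {
        self.base += off;
        self.load(self.base)
    }
}
```

The store variants behave the same way, with the memory write taking the place of the read.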


In addition, there are instructions that load or store two registers at once: ldp and stp (load pair, store pair). These instructions can be used with the same addressing modes as ldr and str.


// push `x0` and `x1` onto the stack. after this operation the stack is:
//
//   |------| <x (original SP)
//   |  x1  |
//   |------|
//   |  x0  |
//   |------| <- SP
//
stp x0, x1, [SP, #-16]!

// pop `x0` and `x1` from the stack. after this operation the stack is:
//
//   |------| <- SP
//   |  x1  |
//   |------|
//   |  x0  |
//   |------| <x (original SP)
//
ldp x0, x1, [SP], #16

// these four operations do the same thing as the previous two
sub SP, SP, #16
stp x0, x1, [SP]
ldp x0, x1, [SP]
add SP, SP, #16

// the same again, but for the four registers x0, x1, x2, and x3
sub SP, SP, #32
stp x0, x1, [SP]
stp x2, x3, [SP, #16]
ldp x0, x1, [SP]
ldp x2, x3, [SP, #16]
add SP, SP, #32

Loading immediate values


An immediate value (or simply an immediate) is another name for an integer whose value is known without any computation. To load a 16-bit immediate into a register, optionally shifted left by a certain number of bits, we use the mov (move) instruction. To load a 16-bit immediate with a shift but without replacing the remaining bits, we use movk (move/keep). Here is an example of using both:


mov   x0, #0xABCD, LSL #32  // x0 = 0xABCD00000000
mov   x0, #0x1234, LSL #16  // x0 = 0x12340000
mov   x1, #0xBEEF           // x1 = 0xBEEF
movk  x1, #0xDEAD, LSL #16  // x1 = 0xDEADBEEF
movk  x1, #0xF00D, LSL #32  // x1 = 0xF00DDEADBEEF
movk  x1, #0xFEED, LSL #48  // x1 = 0xFEEDF00DDEADBEEF

Note that immediates are prefixed with #. LSL stands for logical shift left.


Only 16 bits with an optional shift can be loaded into a register at a time. The assembler can in many cases work out the necessary shift itself. For example, it automatically replaces mov x12, #(1 << 21) with mov x12, #0x20, LSL #16.
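
The difference between mov and movk can be checked on a host with a few lines of Rust. This is only a model of the register update, under the assumption that the shift is one of 0, 16, 32 or 48; the function names mov and movk are chosen here to mirror the instructions.

```rust
/// Models `mov xN, #imm16, LSL #shift`: the whole register is replaced.
pub fn mov(imm16: u16, shift: u32) -> u64 {
    (imm16 as u64) << shift
}

/// Models `movk xN, #imm16, LSL #shift`: only the 16 bits at `shift`
/// are replaced, every other bit of the register is kept.
pub fn movk(reg: u64, imm16: u16, shift: u32) -> u64 {
    let mask = 0xFFFFu64 << shift;
    (reg & !mask) | ((imm16 as u64) << shift)
}
```

Chaining mov(0xBEEF, 0) with three movk calls at shifts 16, 32 and 48 builds a full 64-bit constant in the same way as the x1 example above.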


Loading addresses from labels


Places in assembly code can be marked with labels of the form <label>: as follows:


add_30:
    add x1, x1, #10
    add x1, x1, #20

To load the address of the first instruction after a label, you can use the adr or ldr instructions:


adr x0, add_30    // x0 = address of the first instruction after add_30
ldr x0, =add_30   // x0 = address of the first instruction after add_30

Use ldr if the label is not in the same linker section as the instruction. Otherwise, use adr.


Moving data between registers


To move data between registers, use the already familiar mov instruction:


mov  x13, #23    //          x13 = 23
mov  sp, x13     // sp = 23, x13 = 23

Work with special registers


Special and system registers such as ELR_EL1 can be read and written only through general purpose registers, and only with the special instructions mrs and msr.


In order to write to the special register you need to use msr:


msr ELR_EL1, x1  // ELR_EL1 = x1

To read from the special register use mrs:


mrs x0, CurrentEL // x0 = CurrentEL
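
The value read from CurrentEL is not the level number itself: architecturally, the level sits in bits [3:2] and the remaining bits read as zero. A minimal host-side sketch of the extraction follows; the function name exception_level is an assumption for this example.

```rust
/// Extracts the exception level from a raw CurrentEL value,
/// i.e. from what `mrs x0, CurrentEL` leaves in `x0`.
pub fn exception_level(current_el: u64) -> u8 {
    ((current_el >> 2) & 0b11) as u8
}
```

So a raw value of 0b1000 corresponds to EL2, and 0b0100 to EL1.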

Arithmetic


For the simplest arithmetic, the add and sub instructions will be enough for now:


add <dest>, <a>, <b> // dest = a + b
sub <dest>, <a>, <b> // dest = a - b

For example:


mov x2, #24
mov x3, #36
add x1, x2, x3  // x1 = 24 + 36 = 60
sub x4, x3, x2  // x4 = 36 - 24 = 12

Instead of the <b> operand you can also use an immediate value:


sub sp, sp, #120 // sp -= 120
add x3, x1, #120 // x3 = x1 + 120
add x3, x3, #88  // x3 += 88

Logical instructions


The and and orr instructions perform bitwise AND and OR. They are used the same way as add and sub:


mov x1, 0b11001
mov x2, 0b10101
and x3, x1, x2  // x3 = x1 & x2 = 0b10001
orr x3, x1, x2  // x3 = x1 | x2 = 0b11101
orr x1, x1, x2  // x1 |= x2
and x2, x2, x1  // x2 &= x1
and x1, x1, #0b110  // x1 &= 0b110
orr x1, x1, #0b101  // x1 |= 0b101

Branching


Branching is another term for jumping to an address. A branch changes the PC to the given address or to the address of a label. To jump unconditionally to a label, use the b instruction:


b label // jump to label

To jump to a label while saving the address of the next instruction in the link register (lr), use bl. The ret instruction jumps to the address in lr:


my_function:
    add x0, x0, x1
    ret
mov  x0, #4
mov  x1, #30
bl   my_function  // lr = address of the `mov x3, x0` instruction
mov  x3, x0       // x3 = x0 = 4 + 30 = 34

The br and blr instructions are similar to b and bl respectively, but jump to the address held in a register:


ldr  x0, =label
blr  x0          // identical to bl label
br   x0          // identical to b  label

Conditional branching


The cmp instruction compares two registers, or a register and an immediate. It sets the condition flags used by subsequent instructions such as bne (branch if not equal), beq (branch if equal), blt (branch if less than), etc. (ref: C1.2.4)


// add 1 to x0 until it equals x1,
// then call `function_when_eq` and exit
not_equal:
    add  x0, x0, #1
    cmp  x0, x1
    bne  not_equal
    bl   function_when_eq
exit:
    ...
// called when x0 == x1
function_when_eq:
    ret

With an immediate value:


cmp  x1, #0
beq  x1_is_eq_to_zero

Please note: if the branching did not work, then execution simply continues with the next instruction.
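
Under the hood, cmp is a subtraction whose result is thrown away and whose only effect is on the N, Z, C and V flags; the conditional branches then test those flags. The host-side sketch below models that for 64-bit operands. The names Flags, cmp, cond_eq, cond_ne and cond_lt are assumptions for this illustration, and only three of the architectural conditions are shown.

```rust
/// The four condition flags set by `cmp`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Flags {
    pub n: bool,
    pub z: bool,
    pub c: bool,
    pub v: bool,
}

/// Models `cmp a, b`: computes a - b, discards the result, keeps the flags.
pub fn cmp(a: u64, b: u64) -> Flags {
    let (result, borrow) = a.overflowing_sub(b);
    let (_, signed_overflow) = (a as i64).overflowing_sub(b as i64);
    Flags {
        n: (result as i64) < 0,
        z: result == 0,
        c: !borrow, // on ARM, carry set after a subtraction means "no borrow"
        v: signed_overflow,
    }
}

/// beq is taken when Z is set.
pub fn cond_eq(f: Flags) -> bool {
    f.z
}

/// bne is taken when Z is clear.
pub fn cond_ne(f: Flags) -> bool {
    !f.z
}

/// blt (signed less than) is taken when N differs from V.
pub fn cond_lt(f: Flags) -> bool {
    f.n != f.v
}
```

Comparing N with V rather than testing N alone is what keeps blt correct when the subtraction overflows.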


Generalization


The ARMv8 instruction set has many more instructions. You now know the most basic ones, which is enough to pick up most of the rest easily. The instructions are described in (ref: C1.2.4). For a quick reference to the instructions above, see the ISA cheat sheet by Griffin Dietz. Before continuing, answer a few self-test questions:


How would you write memcpy in ARMv8 assembly? [arm-memcpy]

Suppose the source address is in x0, the destination address is in x1, and the number of bytes is in x2 (guaranteed to be greater than zero and a multiple of 8). How would you implement memcpy? Be sure to end with ret.
Hint: This function can be implemented in 6-7 lines of assembly code.

How would you write the value 0xABCDE to ELR_EL1? [arm-movk]

Suppose the program is running at EL1. How would you write the immediate 0xABCDE to the register ELR_EL1 using ARMv8 assembly?
Hint: It will take three instructions.

What does the cbz instruction do? [arm-cbz]

Read the documentation for cbz in the manual (ref: C6.2.36). What does this instruction do? What can it be used for?

What does init.S do? [asm-init]

The file os/kernel/ext/init.S is the part of the kernel that runs before everything else. In particular, the _start symbol will be located at address 0x80000 after the Raspberry Pi firmware finishes initialization. A little later we will modify this file so that it switches to EL1 and sets up the exception vectors.

Read the file os/kernel/ext/init.S up to about context_save. Then, for each comment in the file that indicates what the code does, explain how the code does it. For example, to explain the two comments ("read cpu affinity", "core affinity != 0"), we could say something like this:

The first two bits of the MPIDR_EL1 register (ref: D7.2.74), Aff0, are read, which gives us the number of the core that is currently executing our code. If this number is zero, we jump to setup. Otherwise, we put the core to sleep with wfe to save power.
Hint: Refer to the manual for any instruction or register you are not familiar with.

Subphase C: Switch to EL1


In this subphase, we will write assembly code to switch from EL2 to EL1. The main work is in the files os/kernel/ext/init.S and os/kernel/src/kmain.rs. It is recommended to start this subphase only after you have answered the questions from the previous subphases.


Current Exception Level


We have already added a few functions to the aarch64 module (os/kernel/src/aarch64.rs) that use inline assembly to access low-level information about the system. For example, the sp() function lets you retrieve the current stack pointer at any time, and the current_el() function returns the current exception level. We already mentioned that the CPU will be running at EL2 when the kernel starts. Confirm this by printing the current exception level in kmain(). Note that calling current_el() requires unsafe. We will remove this call once we are convinced that we have successfully switched to EL1.


Switching


Add some assembly code to switch to EL1. Find this line in os/kernel/ext/init.S:


// FIXME: Return to EL1 at `set_stack`.

Right after it there are a couple of assembler instructions:


mov     x2, #0x3c5
msr     SPSR_EL2, x2

From the previous subphase, you should know what they do. In particular, you should know which bits are set in SPSR_EL2 and what the consequences will be once eret executes.


Add the switching code, replacing the FIXME with the correct instructions. Make sure the CPU correctly switches to EL1 and jumps to set_stack, after which kernel setup continues. You will need exactly three instructions to complete the code. Recall that the only way to lower the exception level is through eret. When you are done, make sure current_el() now returns 1.


Hint: Which register is used to set the PC when returning from an exception?

Subphase D: Exception Vectors


In this subphase, we will install and configure exception vectors and the handlers for those exceptions. This is the first step toward a kernel that can handle arbitrary exceptions and interrupts. You will test your exception handling code by writing a minimal debugger that starts in response to brk #n. The main work is in the file kernel/ext/init.S and the directory kernel/src/traps.


Overview


Recall that the exception vector table consists of 16 vectors, where each vector is a series of no more than 16 instructions. We have reserved space in init.S for these vectors and placed the label _vectors at the base of the table. Your task is to populate the table with the 16 vectors so that, ultimately, the Rust function handle_exception in kernel/src/traps/mod.rs is called with the appropriate arguments when an exception occurs. All exceptions will be routed to handle_exception. That function will determine why the exception occurred and dispatch it to higher-level handlers as needed.


Call Conventions


To properly call the handle_exception function declared in Rust, we need to know how the function expects to be called. In particular, we need to know where the function expects to find the values of its parameters info, esr and tf, what it promises about the state of the machine after the call, and how it will return control.


This problem of knowing how to call foreign functions arises whenever one language calls another (as in Lab 2 between C and Rust). Instead of studying how each individual language does this, standardized calling conventions are used. A calling convention, or procedure call standard, is a set of rules that defines the following:


  • How to pass parameters to a function. On AArch64, the first 8 parameters are passed in registers r0...r7, in order from left to right.
  • How to return values from a function. On AArch64, the first 8 return values are passed back in registers r0...r7.
  • What state (registers, stack, etc.) the function must preserve.
    Registers are usually divided into caller-saved and callee-saved.
    Caller-saved registers are not guaranteed to be preserved across a function call. Thus, if the caller needs the value in such a register, it must save the register before calling the function.
    Callee-saved registers, on the other hand, are guaranteed to be preserved across a call. That is, the called function must take care of these registers and return them in the same state in which it received them.
    Register values are usually saved and restored using the stack.
    On AArch64, registers r19...r29 and SP are callee-saved. The rest are caller-saved. Note that this includes lr (x30). SIMD/FP registers have non-trivial rules for preservation. For our purposes, it suffices to say that they are also caller-saved.
  • How to return control. AArch64 has the lr register, which holds the return address. The ret instruction jumps to the address in lr.

On AArch64, all these conventions are spelled out in full in (guide: 9) and in the procedure call standard. When you call the Rust function handle_exception from assembly, you need to make sure that you follow all of these conventions.


How does Rust know which agreement to use?

Strictly adhering to a calling convention rules out all kinds of optimizations of function calls. As a result, Rust functions by default make no guarantee of following any particular calling convention. To force Rust to compile a function according to the platform's calling convention, you need to add the extern qualifier to the function. We have already declared handle_exception as extern, so we can be sure that Rust will compile the function in the expected way.
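
As a small illustration of the qualifier, here is a sketch of a function declared with the platform C calling convention, which is what a bare extern means. The function add_from_asm and its arguments are hypothetical and unrelated to the lab's handle_exception; the point is only that the ABI is part of the function's type.

```rust
/// Compiled with the platform's C calling convention. On AArch64 that
/// means `a` arrives in x0, `b` in x1, and the result is returned in x0.
/// (Hypothetical example function, not part of the lab skeleton.)
pub extern "C" fn add_from_asm(a: u64, b: u64) -> u64 {
    a + b
}

/// The calling convention also shows up in function pointer types:
/// a plain `fn(u64, u64) -> u64` would not accept `add_from_asm`.
pub type AsmCallable = extern "C" fn(u64, u64) -> u64;
```

From assembly, such a function could be reached with bl once its arguments are in the expected registers, provided the symbol name is also exported unmangled.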

Vector table


To help you fill in the vector table, we have provided the HANDLER(source, kind) macro, which expands to a sequence of six instructions plus the necessary alignment directive. When HANDLER(a, b) is used as an "instruction", it expands to the lines that follow its #define. That is, this entry:


_vectors:
    HANDLER(32, 39)

It will become like this:


_vectors:
    .align 7
    stp     lr, x0, [SP, #-16]!
    mov     x0, #32
    movk    x0, #39, LSL #16
    bl      context_save
    ldp     lr, x0, [SP], #16
    eret

This code saves lr and x0 on the stack and builds in x0 a 32-bit value made of the 16-bit source and the 16-bit kind. It then calls context_save, which is declared above _vectors. After that function returns, lr and x0 are restored from the stack, and finally eret returns from the exception.


The context_save function currently does nothing. It simply falls through to the ret in context_restore. A little later we will change context_save so that it correctly calls the Rust function.


Syndrome


When a synchronous exception occurs (an exception caused by executing or attempting to execute an instruction), the CPU sets a value in the syndrome register (ESR_ELx) that describes the cause of the exception (ref: D1.10.4). Structures for handling this can already be found in kernel/src/traps/syndrome.rs. There are also some stubs for parsing the syndrome value into a Syndrome enum. A little later, you will write code that passes the value of ESR_ELx to Rust as the esr parameter. You will then use Syndrome::from(esr) to parse it and determine what to do next.
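
To give a feel for what parsing the syndrome involves, here is a deliberately partial host-side sketch. It assumes the architectural layout in which the exception class (EC) is in bits [31:26] of ESR_ELx, and it recognizes only two classes: 0b010101 for svc and 0b111100 for brk executed in AArch64, whose 16-bit immediate is in the low bits of the ISS field. The type SyndromeSketch and the function decode_esr are invented here and are not the Syndrome type from syndrome.rs, which covers many more cases.

```rust
/// A cut-down stand-in for the lab's `Syndrome` enum (illustration only).
#[derive(Debug, PartialEq, Eq)]
pub enum SyndromeSketch {
    Svc(u16),   // svc #imm from AArch64
    Brk(u16),   // brk #imm from AArch64
    Other(u32), // anything this sketch does not decode
}

/// Decodes the exception class and, for svc/brk, the 16-bit immediate.
pub fn decode_esr(esr: u32) -> SyndromeSketch {
    let ec = esr >> 26;              // exception class, bits [31:26]
    let imm = (esr & 0xFFFF) as u16; // low 16 bits of the ISS field
    match ec {
        0b010101 => SyndromeSketch::Svc(imm),
        0b111100 => SyndromeSketch::Brk(imm),
        _ => SyndromeSketch::Other(esr),
    }
}
```

With this layout, a brk 2 would be reported as Brk(2), which is the information a breakpoint handler needs to tell breakpoints apart.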


Info


The handle_exception function takes an Info structure as its first parameter. This structure has two 16-bit fields: source and kind. As you might have guessed, this is exactly the 32-bit value that the HANDLER macro places in x0. You will need to use the right HANDLER invocation for each entry so that the Info structure is created correctly.
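
The mapping from the raw 32-bit value to the two fields might look like the sketch below. DemoInfo and info_from_raw are hypothetical names, and the numeric values assigned to the variants are an assumption made for illustration; check the definitions in the lab's own sources before relying on them.

```rust
/// Assumed numbering, for illustration only.
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    CurrentSpEl0 = 0,
    CurrentSpElx = 1,
    LowerAArch64 = 2,
    LowerAArch32 = 3,
}

/// Assumed numbering, for illustration only.
#[repr(u16)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Synchronous = 0,
    Irq = 1,
    Fiq = 2,
    SError = 3,
}

/// Two 16-bit fields laid out back to back, like the value built in x0.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DemoInfo {
    pub source: Source,
    pub kind: Kind,
}

/// Splits the raw value: low half is the source, high half is the kind.
/// Returns `None` if either half is outside the assumed numbering.
pub fn info_from_raw(raw: u32) -> Option<DemoInfo> {
    let source = match raw & 0xFFFF {
        0 => Source::CurrentSpEl0,
        1 => Source::CurrentSpElx,
        2 => Source::LowerAArch64,
        3 => Source::LowerAArch32,
        _ => return None,
    };
    let kind = match raw >> 16 {
        0 => Kind::Synchronous,
        1 => Kind::Irq,
        2 => Kind::Fiq,
        3 => Kind::SError,
        _ => return None,
    };
    Some(DemoInfo { source, kind })
}
```

Under this assumed numbering, a table entry written as HANDLER(1, 0) would describe a synchronous exception taken from the current exception level using SP_ELx.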


Implementation


Now you are ready to write the minimum exception handling code. The first exception that we will handle is brk, i.e. breakpoint. When such an exception occurs, we need to launch an interactive shell that theoretically allows us to examine the state of the machine at this point.


To get started, insert a brk instruction into kmain using inline assembly like this:


unsafe { asm!("brk 2" :::: "volatile"); }

Then we proceed as follows:


  1. Fill in the _vectors table using the HANDLER macro. Make sure your entries create the Info structure correctly.
  2. Call handle_exception from context_save.
    Be sure to save / restore all caller-saved registers as needed and pass the appropriate parameters. This should take 5 to 9 instructions. For now, you can pass 0 as the tf parameter. We will use this parameter later.
    Note. AArch64 requires SP to be 16-byte aligned whenever it is used in a load or store. Be sure to comply with this requirement.
  3. Set the VBAR register at the place marked in the code:
    // FIXME: load `_vectors` addr into appropriate register (guide: 10.4)
  4. At this point, handle_exception should be called whenever an exception occurs.
    In handle_exception, print the values of the info and esr parameters to make sure that they are what you expect. Then loop forever. To make sure the loop is not optimized away, you can call aarch64::nop() inside it. We will need to write more code to return from the exception handler correctly, so for now we simply block everything. We will fix this in the next subphase.
  5. Implement the Syndrome::from() and Fault::from() methods.
    The first method should call the second. You will need to refer to ( ref : D1.10.4, ref : Table D1-8) to implement everything correctly. Click on the “ISS encoding description” in the table to view detailed information on how to decode the syndrome for a specific exception class. For example, you should make sure that the syndrome for brk 12 is decoded as Syndrome::Brk(12), and the syndrome for svc 77 is decoded as Syndrome::Svc(77). Please note that we have excluded the 32-bit variants of some exceptions and merged exceptions that are identical but occur with different exception classes.
  6. Start a shell when a brk exception occurs.
    Use Syndrome::from() in handle_exception to detect a brk exception. When such an exception occurs, start a shell. You can use a different shell prefix to tell the shells apart. Note that you should call Syndrome::from() only for synchronous exceptions. Otherwise, the ESR_ELx register is not guaranteed to contain a valid value.
    At this point, you will also need to modify the shell and implement the exit command. When exit is entered in the shell, the shell must end its loop and return control. This will allow us to return from the brk exception later. Along with this change, you may need to wrap the shell() call in kmain in a loop { } to prevent the kernel from crashing.
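
The exit behaviour described in step 6 can be sketched as a loop that returns to its caller instead of running forever. Everything here is hypothetical: demo_shell reads commands from a slice and writes to a String so that the sketch runs anywhere, whereas the real shell reads from and writes to the console.

```rust
/// Hypothetical shell loop. Echoes each command after `prefix` into `log`
/// and returns as soon as `exit` is entered, handing control back to the
/// caller (for example, an exception handler).
/// Returns the number of non-empty commands seen before `exit`.
pub fn demo_shell(prefix: &str, lines: &[&str], log: &mut String) -> usize {
    let mut handled = 0;
    for line in lines {
        // Show the prompt followed by what was typed.
        log.push_str(prefix);
        log.push_str(line);
        log.push('\n');
        match line.trim() {
            "exit" => return handled, // leave the loop, return to the caller
            "" => {}                  // empty input: just show the prompt again
            _ => handled += 1,        // a real shell would dispatch the command here
        }
    }
    handled
}
```

Because the function returns on exit, a caller that must never fall off the end (such as kmain) would wrap the call in its own loop, while the exception handler can simply continue after the shell returns.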

Once you are done, the brk 2 instruction in kmain should raise an exception with syndrome Brk(2), source equal to CurrentSpElx, and kind equal to Synchronous. At this point, the debug shell should start. When exit is entered in the shell, the shell should stop and the exception handler should hang in an infinite loop.


Before proceeding, make sure that you correctly identify other synchronous exceptions. Try executing other instructions that raise an exception, such as svc 3. You should also try to deliberately cause a data or instruction abort by jumping to an address outside the physical memory range.


As soon as everything works as you expected, you are ready to move on to the next step.



UPD : next part

