lorc March 19, 2010 at 07:02

How ARM loads

My last post was purely theoretical; this one will be practical. The practice is fairly hardcore (I only got to grips with it myself after a year of working with ARMs): initializing the processor and memory. In other words, what has to be done to the processor before you reach the function main(). The first part of the article covers build and debugging tools, the second covers exception vectors, and the third covers stack and memory initialization.
But first, one clarification. For some reason many people think that an ARM is necessarily a monster with external memory, a pile of support circuitry, running at 600 MHz or more, and so on. This is only partly true (for ARM9 and later families). The chip I usually work with (AT91SAM7X512) is not much more complicated than the familiar AVRs. All it needs to run is a crystal and power (it can run without a crystal, but then things get rather sad). That is all. Of course it can do much more than an AVR, but more on that later. Today's article is not tied to any specific hardware.

Compilers, linkers, debuggers

This question worries a lot of people. There are paid toolchains (IAR, Keil MDK, CrossWorks) and free ones (gcc-arm). I will use gcc-arm in the examples. For Windows there are the prebuilt packages WinARM (seems to be dead) and YAGARTO. In principle, you can build your own. There is also a fun thing called coLinux, but that is a completely different story. Under Linux, the cross-compiler is usually built using the distribution's standard tools. In short, read the docs :)
Another useful thing is a standard library: the one that implements functions like printf, mktime, malloc and everything else C programmers are used to. glibc will not do because it is too large. The free newlib is usually used instead. It is included in WinARM / YAGARTO, but Linux users will have to build it by hand. Again, read the documentation :)
Debuggers are a little more complicated. You can use emulators, but they are pretty buggy when it comes to peripherals. I have no experience here. You can print debug messages to a COM port. I have done this all my life, and it is enough for me in 99% of cases.
But the coolest option is JTAG: a device that connects to the processor and lets you debug the code directly on the chip (set breakpoints, trace, view and change memory, etc.). True, on the one hand it costs money, and on the other the board has to bring out pins for it.

Exception handlers

Okay, let us assume the compiler is installed and configured. Now let's run something. We start from the very beginning: what happens when the processor is reset (for example, after power is applied and the voltage has settled). It is simple: the processor starts executing code from address 0x0. It might seem that you can just place the initialization code at this address and get on with it. But it is not that simple, because the exception vectors live at these first addresses.
For example, when an interrupt occurs the processor starts handling it from address 0x18, and the “undefined instruction” exception is handled from address 0x04. In all, the first 32 bytes are taken by the exception vector table: eight 4-byte slots, one of which (0x14) is reserved. Reset is also an exception.
[Figure: ARM exception vectors]

The figure shows this more clearly. As it shows, 4 bytes are allocated to each handler, which is exactly one processor instruction. (That is, in ARM mode; all handlers are entered in this instruction mode.)
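
As a rough illustration of the layout in the figure, the slot addresses can be computed in C. This is only a sketch for checking the arithmetic on a PC, not startup code: the enum and function names are invented for the example, and the table is assumed to start at address 0x0.

```c
#include <assert.h>
#include <stdint.h>

/* Classic ARM exception vectors in table order (names are illustrative). */
enum arm_vector {
    VEC_RESET = 0,
    VEC_UNDEF,
    VEC_SWI,
    VEC_PREFETCH_ABORT,
    VEC_DATA_ABORT,
    VEC_RESERVED,
    VEC_IRQ,
    VEC_FIQ,
    VEC_COUNT
};

/* Each slot holds exactly one 32-bit ARM instruction. */
#define VECTOR_SLOT_SIZE 4u

/* Address the core jumps to for a given exception (table based at 0x0). */
static uint32_t vector_address(enum arm_vector v)
{
    assert((unsigned)v < (unsigned)VEC_COUNT);
    return (uint32_t)v * VECTOR_SLOT_SIZE;
}

/* Total size of the vector table in bytes, reserved slot included. */
static uint32_t vector_table_size(void)
{
    return (uint32_t)VEC_COUNT * VECTOR_SLOT_SIZE;
}
```

With this numbering, vector_address(VEC_IRQ) gives 0x18 and vector_address(VEC_UNDEF) gives 0x04, matching the addresses mentioned in the text.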
Accordingly, the first thing we should do is write the exception handlers and place them correctly. Let's do this:

               ldr     pc, ResetHandlerAddr
               ldr     pc, UndefHandlerAddr
               ldr     pc, SWIHandlerAddr
               ldr     pc, PrefetchAbtHandlerAddr
               ldr     pc, DataAbtHandlerAddr
               nop
               ldr     pc, IRQHandlerAddr
               ldr     pc, FIQHandlerAddr

What does this code do? These are instructions that load the addresses of the real handlers into the register pc, which amounts to an unconditional jump. The variables holding those addresses come next in the code:

ResetHandlerAddr:       .word ResetHandler
UndefHandlerAddr:       .word UndefHandler
SWIHandlerAddr:         .word SWIHandler
PrefetchAbtHandlerAddr: .word PrefetchAbtHandler
DataAbtHandlerAddr:     .word DataAbtHandler
IRQHandlerAddr:         .word IRQHandler
FIQHandlerAddr:         .word FIQHandler

A few tricks could have been applied here to speed up interrupt handling. For example, as you can see, the FIQ vector is the last one in the table, so the handler for this interrupt could start right there in place.
The AIC (Advanced Interrupt Controller) registers could also be used to jump straight to the interrupt handler. But let us not complicate our lives just yet. For now only Reset handling matters to us.
So let's write the handlers themselves, as simple as possible. They will hang the processor (by endlessly executing an unconditional branch to themselves). We do not know how to handle exceptions yet anyway, so a hung processor is perfectly acceptable.

UndefHandler:           B       UndefHandler
SWIHandler:             B       SWIHandler
PrefetchAbtHandler:     B       PrefetchAbtHandler
DataAbtHandler:         B       DataAbtHandler
IRQHandler:             B       IRQHandler
FIQHandler:             B       FIQHandler

B is the Branch instruction. The next thing we need to do is set up the stack pointer (sp) for each of the operating modes. That way, if an exception does occur, its handler will already have its own stack. But first we describe the sizes of all the stacks.

.EQU IRQ_STACK_SIZE, 0x100
.EQU FIQ_STACK_SIZE, 0x100
.EQU ABT_STACK_SIZE, 0x100
.EQU UND_STACK_SIZE, 0x100
.EQU SVC_STACK_SIZE, 0x100

To keep things short, we allocate 256 bytes of stack to each mode. In fact, for most of these modes that is a lot, although it all depends on the handlers. As you can see, sizes are given for 5 of the 6 modes. The remaining memory will be shared between the heap and the stack of the sixth mode (user mode).
Now we describe constants that make switching between modes easier. The current mode is stored in the CPSR register, which also serves as the status register.

.EQU ARM_MODE_FIQ, 0x11
.EQU ARM_MODE_IRQ, 0x12
.EQU ARM_MODE_SVC, 0x13
.EQU ARM_MODE_ABT, 0x17
.EQU ARM_MODE_UND, 0x1B
.EQU ARM_MODE_USR, 0x10

.EQU I_BIT, 0x80
.EQU F_BIT, 0x40

The constants I_BIT and F_BIT are the bits that disable normal (IRQ) and fast (FIQ) interrupts, respectively. Now we are ready to initialize the stacks. This is done simply: load a pointer to the top of the stack into register r0, then switch to the desired mode, write the value of r0 to sp, then decrease r0 by the size of the stack and repeat.

.RAM_TOP:
               .word   __TOP_STACK

ResetHandler:
               ldr     r0, .RAM_TOP            @ r0 = top of RAM

               msr     CPSR_c, #ARM_MODE_FIQ | I_BIT | F_BIT
               mov     sp, r0
               sub     r0, r0, #FIQ_STACK_SIZE

               msr     CPSR_c, #ARM_MODE_IRQ | I_BIT | F_BIT
               mov     sp, r0
               sub     r0, r0, #IRQ_STACK_SIZE

               msr     CPSR_c, #ARM_MODE_SVC | I_BIT | F_BIT
               mov     sp, r0
               sub     r0, r0, #SVC_STACK_SIZE

               msr     CPSR_c, #ARM_MODE_ABT | I_BIT | F_BIT
               mov     sp, r0
               sub     r0, r0, #ABT_STACK_SIZE

               msr     CPSR_c, #ARM_MODE_UND | I_BIT | F_BIT
               mov     sp, r0
               sub     r0, r0, #UND_STACK_SIZE

               msr     CPSR_c, #ARM_MODE_USR   @ user mode, interrupts enabled
               mov     sp, r0                  @ user stack gets whatever is left
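
To double-check the stack carve-up performed by ResetHandler, the same arithmetic can be replayed in C on a PC. This is a sketch under assumptions: the constants are copied from the .EQU lines, while the struct, the helper names and any RAM top address you feed in are hypothetical and not taken from a datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the .EQU lines in the article. */
#define ARM_MODE_USR   0x10u
#define ARM_MODE_FIQ   0x11u
#define I_BIT          0x80u
#define F_BIT          0x40u

#define FIQ_STACK_SIZE 0x100u
#define IRQ_STACK_SIZE 0x100u
#define SVC_STACK_SIZE 0x100u
#define ABT_STACK_SIZE 0x100u
#define UND_STACK_SIZE 0x100u

/* Initial stack pointer of every mode (hypothetical helper type). */
struct stack_layout {
    uint32_t fiq, irq, svc, abt, und, usr;
};

/* Value written to CPSR_c: the mode number plus optional interrupt masks. */
static uint32_t cpsr_control(uint32_t mode, int mask_irq, int mask_fiq)
{
    assert((mode & ~0x1Fu) == 0);   /* the mode field is 5 bits wide */
    return mode | (mask_irq ? I_BIT : 0u) | (mask_fiq ? F_BIT : 0u);
}

/* Walk down from the top of RAM in the same order as ResetHandler. */
static struct stack_layout compute_stack_layout(uint32_t ram_top)
{
    struct stack_layout l;
    uint32_t r0 = ram_top;          /* plays the role of register r0 */

    l.fiq = r0; r0 -= FIQ_STACK_SIZE;
    l.irq = r0; r0 -= IRQ_STACK_SIZE;
    l.svc = r0; r0 -= SVC_STACK_SIZE;
    l.abt = r0; r0 -= ABT_STACK_SIZE;
    l.und = r0; r0 -= UND_STACK_SIZE;
    l.usr = r0;                     /* user mode gets whatever is left */
    return l;
}
```

For example, with an assumed RAM top of 0x00220000 the user stack would start 0x500 bytes lower, at 0x0021FB00, because the five exception-mode stacks take 256 bytes each.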

Memory initialization

Now we are in unprivileged mode with interrupts enabled and the stack set up. By the way, you cannot simply switch out of this mode; the only way out is through an exception. But more on that in the next article. There is just a little left to do before entering the function main(). We only need to copy some data into RAM and zero the memory that belongs to the .bss segment. This is the memory where global variables are stored. The C language standard promises that global variables without an explicit initializer are zero at program start, and ARM does not guarantee this for us. So we zero the segment by hand:

               MOV     R0, #0
               LDR     R1, =__bss_start__
               LDR     R2, =__bss_end__
LoopZI:
               CMP     R1, R2
               STRLO   R0, [R1], #4
               BLO     LoopZI

The constants __bss_start__ and __bss_end__ are kindly provided to us by the linker.
By the way, here you can see conditional instructions in use (the ones with the LO suffix). They are executed only while R1 is lower than R2 (an unsigned comparison).
We also need to copy pre-initialized variables (ones like int x=42) from ROM to RAM:

               LDR     R1, =_etext
               LDR     R2, =_data
               LDR     R3, =_edata
LoopRel: 
               CMP     R2, R3
               LDRLO   R0, [R1], #4
               STRLO   R0, [R2], #4
               BLO     LoopRel
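
The two loops above are often written in C instead of assembly. A minimal sketch of the same logic follows, assuming word-aligned regions; the function names are invented for the example, and on a PC it can be exercised on ordinary arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Zero every word in [start, end), like the LoopZI loop. */
static void zero_words(uint32_t *start, const uint32_t *end)
{
    assert(start <= end);
    while (start < end) {
        *start++ = 0;
    }
}

/* Copy words from src into [dst, dst_end), like the LoopRel loop. */
static void copy_words(const uint32_t *src, uint32_t *dst,
                       const uint32_t *dst_end)
{
    assert(dst <= dst_end);
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}
```

In real startup code the arguments would be the addresses of the linker symbols, declared for instance as extern uint32_t __bss_start__, __bss_end__; and passed as &__bss_start__ and &__bss_end__. That only works if the linker script actually defines those names.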

If we are writing in C++, we also need to call the constructors of global objects:

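               LDR     r0, =__ctors_start__    @ r0 = first entry of the constructor table
               LDR     r1, =__ctors_end__      @ r1 = one past the last entry
ctor_loop:
               CMP     r0, r1
               BEQ     ctor_end                @ table exhausted
               LDR     r2, [r0], #4            @ r2 = next constructor, advance r0
               STMFD   sp!, {r0-r1}            @ save loop state across the call
               MOV     lr, pc                  @ return address = instruction after BX
               BX      r2
               LDMFD   sp!, {r0-r1}
               B       ctor_loop
ctor_end: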
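
The constructor loop is simply a walk over an array of function pointers. A C sketch of the same idea is shown here; the type name, the helper and the two stand-in "constructors" are made up for the demonstration, whereas a real table would be delimited by __ctors_start__ and __ctors_end__ from the linker script.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ctor_fn)(void);

/* Call every function pointer in [start, end) in order; return the count. */
static size_t run_ctors(const ctor_fn *start, const ctor_fn *end)
{
    size_t called = 0;

    assert(start <= end);
    for (const ctor_fn *p = start; p != end; ++p) {
        (*p)();
        ++called;
    }
    return called;
}

/* Two stand-in "constructors" that record the order they ran in. */
static int order[2];
static int ran = 0;

static void ctor_a(void) { order[ran++] = 1; }
static void ctor_b(void) { order[ran++] = 2; }

/* Demo table standing in for the linker-generated one. */
static const ctor_fn demo_table[] = { ctor_a, ctor_b };
```

Calling run_ctors(demo_table, demo_table + 2) runs ctor_a and then ctor_b, which is the same front-to-back order the assembly loop uses.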

Well, that is basically it. Call main():

               ldr     r0,=main
               bx      r0

Congratulations, we are now in the function void main(void). Now you can initialize the peripherals. So far we have only initialized the software environment, so the processor is running at the lowest possible frequency and all peripherals are disabled. You will not get far like that :)
But peripheral initialization depends on the specific hardware, and the purpose of this article is to explain how to start up an abstract ARM.
A few more caveats: this code cannot be compiled and run as is, because the sections it lives in are not described here. I also did not provide linker scripts (these scripts describe how code and data sections are placed in memory and in the firmware image).
But the Internet is full of ready-made examples for bringing up a particular piece of hardware, with scripts, makefiles and everything else. Look on the manufacturers' websites :)

The next article will apparently be devoted to theory again, this time to a description of processor modes and exceptions.
