
UNIX® on Game Boy Advance
- Translation
Introduction
In this article I describe gbaunix, a fun experiment in which I ran an ancient version of the UNIX operating system in a simulator on a handheld console that was popular at the time of publication (2004; translator's note). Specifically, it is UNIX Fifth Edition (released in 1974, 39 years ago) on the Nintendo Game Boy Advance. It may interest Game Boy homebrew developers, computer science students specializing in operating systems, emulators, or compilers, and UNIX geeks.

Nintendo has been in the games business since 1889 (sic!). Game Boy is the brand name of a whole line of handheld consoles that still sells very well. Since its launch in 1989, 175 million units have been sold. This article deals with the Game Boy Advance, GBA for short, and its sibling, the GBA SP.
Hardware:
- 32-bit ARM processor (RISC), clocked at 16.78 MHz
- 8-bit Z80 processor (CISC), clocked at 4.2 or 8.4 MHz. Included for compatibility with older Game Boys, which used it as their CPU
- 4 16-bit timers
- 4 DMA channels
- Color TFT screen, resolution 240x160
- Stereo sound through headphones
- 10 buttons
- Serial port
- Interface for GamePak cartridge
The console has several memory blocks:
- 16 KB BIOS ROM
- 256 KB of external RAM soldered to the board (EWRAM)
- 32 KB of internal RAM on the CPU chip (IWRAM)
- 1 KB RAM for background and sprite palette
- 96 KB Video RAM
- 1 KB RAM for object attributes
- Up to 32 MB ROM cartridge
- Up to 64 KB SRAM cartridge, optional
Note that the cartridge ROM has two additional mappings in the console's address space. In addition, the cartridge may contain several memory banks with different sizes or timings.
The GBA can boot from a regular cartridge, a rewritable flash cartridge, another GBA in master mode, or even from a computer over a cable. This functionality is built into the BIOS.
The console hardware is mostly accessible through a well-documented address space. The various I/O registers are mapped to memory addresses. In addition, the BIOS contains many functions that are available through software interrupts.
A digression on the ARM architecture
ARM
Nintendo is the leader in the console market, and ARM is the leader in the RISC processor market. ARM processors are now found in almost all electronics. Intel rules the desktop, true, but PC processors make up a very small fraction of total production. In 2003, 782 million ARM-based devices were manufactured. Overall, the architecture's share of this market is about 75%.
ARM is most often mentioned in the same sentence as the words "embedded", "high-performance", "cheap", "low-power", and "RISC". ARM licenses the architecture directly to processor manufacturers, including such large ones as Intel, Apple, and Samsung.
The first ARM was developed at Acorn Computers Limited in the mid-1980s. At that time the name stood for "Acorn RISC Machine". The very first version of the ARM architecture, ARMv1, supported 26-bit addressing and was very slow. The first processor of this architecture, ARM1, served as a coprocessor in the BBC microcomputer and in the prototype of the Archimedes workstation. It was one of the first RISC processors. Unlike several other RISC designs of the time, it had no delayed branches and no register windows, and not every instruction executed in a single clock cycle.
Apple tried using ARM in the early 1990s. Together with Acorn, it founded a new company, Advanced RISC Machines Limited. VLSI Technology was another partner. The expansion of the acronym changed accordingly. The new processor was called ARM6, and its architecture ARMv3. It has full 32-bit addressing. This processor was used in the Apple Newton.
The Game Boy uses the ARM7TDMI processor, which implements the ARMv4T architecture.
ARM in the Game Boy
ARM7TDMI is a popular embedded 32-bit processor with a slightly reduced feature set. It has neither a cache nor an MMU. The name breaks down as follows:
- ARM7: the processor core
- T: supports the 16-bit Thumb instruction set
- D: supports debugging directly in hardware
- M: has a built-in 32-bit multiplier with a 64-bit result
- I: includes EmbeddedICE for debugging
The ARM7TDMI core has a single 32-bit bus for data and instructions. Data can be 8, 16, or 32 bits wide. Only load, store, and swap instructions can access memory. There are 31 general-purpose 32-bit registers, 6 status registers, a barrel shifter, an ALU, and a multiplier. Not all registers are available at the same time. For example, in ARM state, 16 general-purpose registers and 1 or 2 status registers are visible. There are 7 processor modes: user mode for running ordinary programs, and 6 special ones. These are fast interrupt (FIQ), interrupt (IRQ), supervisor, abort, undefined, and system. Two instruction sets are supported, ARM and Thumb.
The Game Boy supports 4 DMA channels. The processor has two kinds of interrupts, regular IRQ and fast FIQ, but the console uses only the regular ones.
ARM7 has a simple three-stage pipeline:
- Fetch: the instruction is read from memory and placed in the queue.
- Decode: the instruction is decoded.
- Execute: the instruction is executed. Registers are read, the result is computed, and it is written back to the registers.
Thus, at any moment one instruction is being executed, the next one is being decoded, and the one after that is being fetched.
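The overlap between the three stages can be sketched in a few lines of C. This is only an illustration of an ideal pipeline with no branches and no memory stalls; the names pipeline_state, pipeline_at, and pipeline_trace are made up for this example and are not part of gbaunix or SIMH.

```c
#include <stdio.h>

/* Index of the instruction occupying each stage in a given cycle,
 * or -1 when the stage is empty (pipeline filling up or draining).
 * Assumption: an ideal pipeline, no branches, no memory stalls. */
typedef struct {
    int fetch;
    int decode;
    int execute;
} pipeline_state;

pipeline_state pipeline_at(int cycle, int n_instructions)
{
    pipeline_state s;
    int f = cycle;       /* instruction being fetched  */
    int d = cycle - 1;   /* instruction being decoded  */
    int e = cycle - 2;   /* instruction being executed */

    s.fetch   = (f >= 0 && f < n_instructions) ? f : -1;
    s.decode  = (d >= 0 && d < n_instructions) ? d : -1;
    s.execute = (e >= 0 && e < n_instructions) ? e : -1;
    return s;
}

/* Print one line per cycle until the last instruction has executed. */
void pipeline_trace(int n_instructions)
{
    int cycle;
    for (cycle = 0; cycle < n_instructions + 2; cycle++) {
        pipeline_state s = pipeline_at(cycle, n_instructions);
        printf("cycle %d: fetch=%d decode=%d execute=%d\n",
               cycle, s.fetch, s.decode, s.execute);
    }
}
```

Calling pipeline_trace(5) prints seven lines: two cycles to fill the pipeline, then one instruction completing per cycle. The two-instruction gap between fetch and execute is also why, in ARM state, the program counter reads as the address of the executing instruction plus 8.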
Thumb
A relative disadvantage of RISC processors is their comparatively bulky code. It increases program size and leads to poorer cache efficiency, excess memory traffic, and higher power consumption. This is especially bad for embedded applications. The Thumb architecture was developed to make code more compact.
Thumb is a compressed 16-bit encoding of the 32-bit ARM instructions. It covers a subset of the most frequently used instructions, which operate on the same 32-bit registers. Thumb-capable processors have a decoder in the pipeline that converts Thumb instructions into regular ARM instructions. The difference is roughly as follows:
- Code is about 35% smaller
- About 40% more instructions are needed
- Code is 40% slower from 32-bit memory
- But 60% faster from 16-bit memory
- 30% lower power consumption
A mix of ARM and Thumb code is commonly used. A special flag indicates which kind of instruction is being executed. The Game Boy has 32 KB of fast memory right on the chip, and speed-critical code is usually run from there. Everything else is best compiled as Thumb and run from the slow cartridge memory.
(habrahabr.ru/post/92494 is useful general reading on ARM. Translator's note.)
High-level architecture of gbaunix
gbaunix is UNIX Fifth Edition running on the Game Boy. It uses SIMH, a simulator of various vintage computers written in C. SIMH can emulate many other machines, but only the PDP-11 is used here. Several C toolchains exist for the Game Boy, so porting SIMH to it took little effort. The architecture of the resulting system is shown below:

The gbaunix cartridge is a concatenation of the simulator runtime and the disk image with the OS. The remaining space can be left empty or used for something useful.
The simulator is a minimally modified SIMH. Game Boy-specific features are implemented in a separate abstraction layer.
TTY output.
gbaunix simulates a text terminal for SIMH. Output is redirected to a routine running on the Game Boy, which formats it and displays it on the screen. Scrolling is supported. printf() is split into an sprintf() into a buffer, which is then displayed on the screen.
TTY input.
There is currently no proper input support at all. You can only run a sequence of shell commands fixed at compile time. It is specified in the file gba/gba_kbd.h. While UNIX is running, pressing the Start button issues the next command from the list. Button presses can be detected either by polling or through interrupts. Example:
/* gba/gba_kbd.h */
const char *gba_kbdinput[] = {
"unix\r",
"root\r",
"chdir /work\r",
"ls -l\r",
"./fact 100\r",
"cat hanoi.c\r",
"./hanoi\r",
"./hanoi 3\r",
"chdir /tmp\r",
"echo 'main() { printf(\"Hello, World!\\n\"); }' \
> hello.c\r",
"cc hello.c\r",
"./a.out\r",
/* ... more commands ... */
NULL,
};
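How such a list might be consumed can be sketched as follows. This is a hypothetical illustration, not the gbaunix source: the kbd_queue type and both functions are invented here, and the only assumption taken from the article is that the list is a NULL-terminated array of strings and that each Start press yields the next entry.

```c
#include <stddef.h>

/* Hypothetical cursor over a NULL-terminated command list such as
 * gba_kbdinput[]. Names are illustrative, not taken from gbaunix. */
typedef struct {
    const char *const *commands;  /* NULL-terminated array of strings */
    size_t next;                  /* index of the next command to issue */
} kbd_queue;

void kbd_queue_init(kbd_queue *q, const char *const *commands)
{
    q->commands = commands;
    q->next = 0;
}

/* Meant to be called once per Start press. Returns the next command,
 * or NULL once the list is exhausted (and on every call after that). */
const char *kbd_queue_next(kbd_queue *q)
{
    const char *cmd = q->commands[q->next];
    if (cmd != NULL)
        q->next++;
    return cmd;
}
```

The terminal layer would then feed the returned string to the simulated console one character at a time, as if it had been typed. Stopping at the NULL sentinel rather than wrapping around keeps repeated Start presses harmless once the script is finished.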
File system.
The disk image takes 2.5 MB, so it fits only in the cartridge ROM. Although this memory is read-only, writes to it still have to be handled somehow. To do this, each write operation creates a buffer in RAM. Every read and write first checks whether a buffer exists for the affected area. As the number of such buffers grows, they are merged. The simulator is handed a virtual stdio layer in place of the real one.
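The idea of shadowing a read-only image with RAM buffers can be sketched as below. This is a simplified stand-in, not the gbaunix implementation: it uses fixed-size blocks instead of per-write buffers that merge, and the block size, buffer count, and all names (overlay_disk, overlay_read, overlay_write) are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 512   /* assumed block size */
#define MAX_DIRTY  8     /* assumed number of RAM buffers, tiny for illustration */

typedef struct {
    const unsigned char *rom;     /* read-only disk image */
    size_t rom_size;              /* image size in bytes */
    size_t block_no[MAX_DIRTY];   /* which block each buffer shadows */
    unsigned char data[MAX_DIRTY][BLOCK_SIZE];
    int used;                     /* number of buffers in use */
} overlay_disk;

void overlay_init(overlay_disk *d, const unsigned char *rom, size_t rom_size)
{
    d->rom = rom;
    d->rom_size = rom_size;
    d->used = 0;
}

/* Return the RAM buffer shadowing block b, or NULL if there is none. */
static unsigned char *overlay_find(overlay_disk *d, size_t b)
{
    int i;
    for (i = 0; i < d->used; i++)
        if (d->block_no[i] == b)
            return d->data[i];
    return NULL;
}

/* Read one byte: RAM buffer first, ROM otherwise. -1 if out of range. */
int overlay_read(overlay_disk *d, size_t off)
{
    unsigned char *buf;
    if (off >= d->rom_size)
        return -1;
    buf = overlay_find(d, off / BLOCK_SIZE);
    return buf ? buf[off % BLOCK_SIZE] : d->rom[off];
}

/* Write one byte. Returns 0 on success, -1 if out of range or out of buffers. */
int overlay_write(overlay_disk *d, size_t off, unsigned char value)
{
    size_t b = off / BLOCK_SIZE;
    unsigned char *buf;
    if (off >= d->rom_size)
        return -1;
    buf = overlay_find(d, b);
    if (buf == NULL) {
        size_t start = b * BLOCK_SIZE;
        size_t len = d->rom_size - start;
        if (d->used == MAX_DIRTY)
            return -1;
        if (len > BLOCK_SIZE)
            len = BLOCK_SIZE;
        buf = d->data[d->used];
        memset(buf, 0, BLOCK_SIZE);
        memcpy(buf, d->rom + start, len);  /* copy the block on first write */
        d->block_no[d->used] = b;
        d->used++;
    }
    buf[off % BLOCK_SIZE] = value;
    return 0;
}
```

Fixed-size blocks trade some RAM for a trivial lookup. A scheme with variable-length buffers that merge, as the article describes, would use memory more sparingly at the cost of more bookkeeping, which matters when only 256 KB of EWRAM is available.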
Memory.
The simulated PDP-11 gets 128 KB out of the 256 KB of EWRAM available in hardware. Memory access is optimized in places, for example by using DMA.
Misc.
This covers everything else, such as initialization of the runtime, TTY, interrupts, file system, and so on.
Development
gbaunix works both in a Game Boy emulator and on a real console. To run it on hardware, you need a flash cartridge and a programmer for it.
Using a Game Boy emulator is much more convenient, especially if you want to tinker with the source. gbaunix is still only lightly optimized and runs very slowly on a real console. An emulator lets you work at the highest available speed, which saves a lot of nerves.
My development environment runs on Mac OS X. I use the Boycott Advance emulator and the devkitARM toolchain, built from source. The result was checked on a real console with a flash cartridge.
A digression on the history of UNIX
There was a long text here, not directly related to the subject of the article but too interesting to throw away. It has been moved to a separate post: habrahabr.ru/post/194160
Gbaunix demo, with comments
Fifth Edition, June 1974
I am using the fifth edition because it is the oldest version for which system images and kernel source are available for download. We will also try recompiling the kernel, simply because that is the proper way to do it.
When you turn on the power, gbaunix shows some information about the PDP-11 hardware and the bootloader prompt @. When you press Start, the command queue supplies the name of the kernel to load, unix. Booting on a real Game Boy takes about two minutes. After booting, a login prompt appears.
The semicolon in the prompt (;login: is also a print magazine published by the USENIX Association) is needed for a terminal that was popular in the early 1970s, the Teletype Model 37. It switches that terminal into full-duplex mode. All other terminals, including our GBA TTY, simply display the character on the screen.
Even such an early version of UNIX has block and character device files. RK0 is the first disk, and /dev/mem exposes system memory for use with a debugger and for patching the live system.
glob, short for global, is a command external to the shell that expands the shell metacharacters * and ? in command arguments. glob expands the metacharacters and invokes the requested command. If nothing matches, the traditional error No match is reported.
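The matching that glob performs for * and ? can be sketched in C as below. This is a modern illustration, not the fifth edition source: the function names are made up, only the two metacharacters mentioned above are handled, and the list of candidate names is passed in rather than read from a directory.

```c
/* Return 1 if name matches pattern, 0 otherwise.
 * '?' matches exactly one character, '*' matches any run of
 * characters, including an empty one. Illustrative sketch only. */
int glob_match(const char *pattern, const char *name)
{
    for (; *pattern != '\0'; pattern++, name++) {
        if (*pattern == '*') {
            /* Consecutive stars are equivalent to one. */
            while (pattern[1] == '*')
                pattern++;
            /* Try every possible length for the run matched by '*'. */
            do {
                if (glob_match(pattern + 1, name))
                    return 1;
            } while (*name++ != '\0');
            return 0;
        }
        if (*name == '\0')
            return 0;
        if (*pattern != '?' && *pattern != *name)
            return 0;
    }
    return *name == '\0';
}

/* Copy the matching names into out[] (which must have room for n
 * entries) and return how many matched. A result of 0 is the case
 * in which the caller would report "No match". */
int glob_expand(const char *pattern, const char *const *names, int n,
                const char **out)
{
    int i, count = 0;
    for (i = 0; i < n; i++)
        if (glob_match(pattern, names[i]))
            out[count++] = names[i];
    return count;
}
```

With the names fact, hanoi, hanoi.c, and hello.c, the pattern *.c selects the last two. Keeping expansion in a separate program, as the fifth edition did, kept the shell itself small, which mattered on a machine with so little memory.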
Curiously, some commands, such as mkfs, are hidden in /etc, out of reach of accidental invocation by clumsy hands.
dc is a reverse Polish notation calculator. It was the first program to run on the PDP-11, even before a UNIX version for that computer existed.
A typical fifth edition kernel takes up less than 26 KB. The shell takes 5738 bytes, and /init only 1972 bytes. There is also a minimal /etc/rc script. /etc/update updates the file system superblock every 30 seconds to reduce losses on a crash.
The detailed output of the ls command includes access rights, number of links, owner, size in bytes, time of last modification, and name.
By the time of the fifth edition, a rich development infrastructure supporting many programming languages already existed: for example, Algol-68, APL, assembler, BASIC, C, FORTRAN, M6, PASCAL, Snobol, and TMG. In principle, gbaunix lets you program directly on the Game Boy in several languages (in theory, that is; in practice it is inconvenient at best because there is no keyboard). The disk image includes compilers or interpreters for C, PDP-11 assembler, BASIC, shell, and Fortran. I also tried Algol but did not add it to the disk image. One example is the Tower of Hanoi program.



More screenshots are available on a separate screenshots page.
Subsequent editions
The sixth edition (May 1975) left its mark on history because BSD and Xenix have their roots in it. John Lions wrote his famous "Lions' Commentary on UNIX 6th Edition, with Source Code" about it. It is also the earliest fully preserved version of UNIX. The documentation for the fifth edition has been lost, and almost nothing remains of the fourth and earlier ones. From the sixth edition onward, the development of UNIX systems picked up pace noticeably. The seventh edition followed in January 1979, the eighth in February 1985, the ninth in September 1986, and the tenth in October 1989.
BSD
Several other systems supported by SIMH can also be run on the Game Boy. This gets increasingly complicated, though, and is limited by the available RAM. A couple of screenshots from BSD 2.9:


gbaunix optimization
gbaunix leaves room for experiments that affect its speed, but they require recompilation.
Placing code in IWRAM
gbaunix contains examples of compiling code as ARM and placing it in IWRAM. This is probably most useful for the processor simulation code.
DMA
The Game Boy offers several ways of working with memory, each with its own performance and limitations. gbaunix uses DMA3 (the general-purpose channel) for the memcpy() and memset() functions. In addition, the BIOS contains functions for copying and filling memory that are invoked through software interrupts. Overall, gbaunix allows memory handling to be tuned in detail.
Caching
As mentioned earlier, gbaunix emulates stdio and a virtual disk, presenting cartridge memory as a UNIX file. Since the hardware cannot write to ROM, buffers have to be used, as described above. gbaunix can preload fragments of the disk into these buffers, which speeds up booting the system considerably.
Recompiling the UNIX Kernel
Recompiling the kernel brings no noticeable benefit for gbaunix, but it is interesting anyway, if only for comparison with the same procedure on modern systems. We can trim the kernel's hardware support and save a little space that way.
Consider the size of the fifth edition source code:
Headers: 418 lines in 13 files
C: 7222 lines in 43 files (including drivers for peripherals)
Assembly: 1080 lines in 2 files
I did not even try to compile the kernel directly on the Game Boy. It would take forever, and we would run out of buffer memory. In any case, the hardware is too weak for such a task.
The following sequence of commands assumes a working installation of the fifth edition, on real hardware or in a simulator, with the kernel source in the default directory, /usr/sys.
Delete the libraries that may be left over from the previous compilation:
# chdir /usr/sys
# rm -f lib1
# rm -f lib2
Before starting the compilation, review the parameters in the file /usr/sys/param.h and set them correctly. If the compiler reports the error "undefined KISA0", add the definition to /usr/sys/seg.h:
# echo '\#define KISA 0172340' >> /usr/sys/seg.h
Actually compilation:
# chdir ken
# cc -c -O *.c
...
# ar vr ../lib1 main.o alloc.o iget.o prf.o rdwri.o \
slp.o subr.o text.o trap.o sig.o sysent.o clock.o fio.o \
malloc.o nami.o pipe.o sys1.o sys2.o sys3.o sys4.o
...
Compiling the drivers and the rest. Some of them can be left out:
# chdir ../dmr
# cc -c -O *.c
...
# ar vr ../lib2 *.o
System configuration and linking. The output is a kernel binary:
# chdir ../conf
# cc mkconf.c
# mv a.out mkconf
# echo rk | ./mkconf
# cc -c c.c
# as l.s
# mv a.out l.o
# as mch.s
# mv a.out mch.o
# ld -x l.o mch.o c.o ../lib1 ../lib2
# mv a.out rkunix
rkunix is the freshly compiled kernel. If you place it in the root directory as /rkunix and type rkunix instead of unix at the boot prompt, it will boot.
Ideas and suggestions
A short list of promising ideas, mostly of academic interest.
Native UNIX port to the Game Boy
Porting UNIX to the Game Boy would be a good exercise for students in an operating systems course. Performance should improve dramatically. Ancient operating systems are small enough to fit in one person's head. The sixth edition consists of 44 files:
14 C header files
28 C source files
2 assembly files
Together they contain fewer than 9000 lines of code, device drivers included. About 10% of that is assembly.
In my opinion, the task is quite feasible. The assembly code and porting the compiler may cause difficulties.
Performance improvement
Although I tried to speed things up where possible, the result is still far from final. A good place to start is the processor simulation code (pdp11_cpu.c), where most of the run time is spent.
Input mechanism
At present gbaunix simply executes a command list that is hard-wired at compile time. To get closer to the real thing, the user needs a way to enter commands. One option is a virtual keyboard that works properly with the terminal.
Emulating the original Macintosh on the Game Boy
The first Macintosh was not particularly advanced hardware: a 16-bit processor running at 8 MHz, 128 KB RAM, 64 KB ROM, no cache, no interrupts, a 400 KB floppy disk, and a monochrome screen with a resolution of 512x342. The Game Boy has more powerful hardware, except for the screen. In principle, a lightweight Macintosh emulator could be written or ported. The smaller screen can be compensated for by scrolling, disk writes can be simulated with buffers, and so on.
Another option is to port an 8088 emulator in order to run an ancient version of DOS.
Where can I download
gbaunix-0.0.tar.bz2
Note that you need an RK05 disk image to run it. The image is not included, but it can be downloaded from the website of the PDP-11 Unix Preservation Society if you accept the terms of the license agreement.
http://minnie.tuhs.org/PUPS/
Usage
If you just want to take a look, you can skip recompilation and simply join the simulator runtime and a disk image:
% cat unixv5.tmp disks/unixv5.dsk > unixv5.gba
unixv5.gba is a ready-to-run cartridge image. It can be used both with an emulator and with a real console. If you want to recompile gbaunix (highly recommended, since that is the only way to get real fun out of it), you will need a cross-compilation environment for the ARM architecture. You may have to adjust the Makefile. The RK05 image must be available as disks/unixv5.dsk. After that, in theory, a single make command should be enough.
What else to read on this topic
Sources, binaries and documentation of old versions of UNIX
Dennis Ritchie's writings
SIMH Documentation