Creating an emulator arcade machine. Part 1

Transfer

Writing an arcade emulator is a great learning project, and in this tutorial we will look at the whole development process in great detail. Want to really understand the processor? Then creating an emulator is the best way to learn it.

You will need knowledge of C, as well as knowledge of assembly language. If you do not know assembly language, then writing an emulator is the best way to master it. You will also need to learn hexadecimal math (also known as base 16 or just “hex”). I will tell about this topic.

I decided to choose the emulator of the Space Invaders machine, which uses an 8080 processor. This game and this processor are very popular, so you can find a lot of information about them on the Internet. To complete the project you will need it.

All tutorial source code is uploaded to github . If you have not mastered working with git, then on the github page there is a “Download ZIP” button that allows you to download an archive with all the code.

Introduction to Binary and Hexadecimal Numbers

In “ordinary” mathematics, the decimal number system is used. Each digit of a number can have a value from zero to nine, and when we exceed 9, we add one to the number in the next digit and start again from zero. It's all quite simple and straightforward, and you probably never thought about it.

You may know or have heard that computers work with binary data. Computer geeks call decimal mathematics base-10, and binary ones call base-2. In binary terms, each digit of a number can have only two values, zero or one. In binary code, the counting is as follows: 0, 1, 10, 11, 100, 101, 110, 111, 1000. These are not decimal numbers, so you cannot call them “zero, one, ten, eleven, one hundred, one hundred one”. They are pronounced “zero, one, one-zero, one-one, one-zero-zero”, etc. I rarely read binary numbers out loud, but if necessary, I need to clearly indicate the number system used. Ten, eleven and a hundred do not make any sense in binary terms.

In decimal notation, the number has the following digits: units, tens, hundreds, thousands, tens of thousands, etc. In the binary system, the following digits: ones, twos, fours, eights, etc. In computer science, the value of each binary digit is called a bit. 8 bits are one byte.

In binary terms, the string of numbers quickly becomes very long. To represent a decimal number of 20,000 in binary terms, 16 digits are required: 0b100111000100000. To eliminate this problem, it is convenient to use hexadecimal notation, also known as base-16 (or hex). In base-16, each digit contains 16 values. For values from zero to nine, the same characters are used as in base-10, but for the remaining 6 values, substitutions are used in the form of the first 6 letters of the alphabet, from A to F.

The counting in hexadecimal is performed like this: 0 1 2 3 4 5 6 7 8 9 ABCDEF 10 11 12, etc. In hexadecimal dozens, hundreds and so on do not have the same meaning as in decimal, so people pronounce the numbers separately. For example, $ A57 is pronounced out loud as "A-five-seven." For clarity, you can also add hex, for example “A-five-seven-hex”. In hexadecimal, the analogue of the decimal number 20,000 is $ 4E20 - a much more compact form compared to the 16 bits of the binary system.

I think the hexadecimal system was chosen because of the very natural conversion from binary to hexadecimal and back. Each hex digit corresponds to 4 bits (4 bits) of the same binary number. 2 hex digits are one byte (8 bits). A separate hexadecimal digit can be called a nibble, and some people even write it through y, as “nybble”.

Each hex digit is 4 binary digits.
Hex	A	five	7
Binary	1010	0101	0111

When writing C code, it is considered that the number is decimal (base-10), unless it is marked otherwise. To tell the compiler to C, which is a binary number, we add the number zero and the letter b in lowercase, like this: 0b1101101. The hexadecimal number can be written in C code for adding zero at the beginning and lowercase x: 0xA57. In some assembly languages, the dollar sign $ is used to designate hex numbers $A57.

If you think about it, the connection between binary, hexadecimal and decimal numbers is quite obvious, but with the first engineer who had thought of this before the invention of the computer, this was supposed to be a moment of insight.

Got it all? Fine.

Brief introduction to the processor

If you already know this, you can safely skip the section.

The central processing unit (CPU, CPU) is a machine designed to run programs. The fundamental blocks of the CPU are registers and instructions. As a software developer, you can treat these registers as variables. In our processor 8080, among other registers, there are 8-bit registers called A, B, C, D and E. You can take these registers as the following code in C:

unsignedchar A, B, C, D, E;

All processors also have a program counter (Program Counter, PC). You can take it as a pointer.

unsignedchar* pc;

For a CPU, a program is a sequence of hexadecimal numbers. Each assembly language command in 8080 corresponds to 1-3 bytes in the program. In order to find out which team corresponds to which number, a handbook on the processor (or any other information about the 8080 processor from the Internet) is useful.

The names of commands (instructions) are often mnemonics from operations performed by these commands. The mnemonic for loading in 8080 is MOV (move), and ADD is used to perform addition.

Examples

The current value of the memory pointed to by the command counter is 0x79. This corresponds to MOV A,Cthe 8080 processor instruction . This assembly code in C code looks like A=C;.

If instead the value in the PC would be 0x80, then the processor would execute ADD B. In C, this corresponds to the string A = A + B;.

A complete list of 8080 processor commands can be found here . To implement our emulator, we will use this information.

Timings

In the CPU, each instruction requires a certain amount of time (timing), measured in cycles. In modern processors, this information is difficult to obtain, because the timings depend on many different aspects. But in old processors like 8080, the timings are constant and this information is often provided by the processor manufacturer. For example, the transfer instruction from register to register MOV takes 1 cycle.

Timing information is useful for writing efficient code in the processor. A programmer may strive to avoid instructions that take many cycles to execute.

More important for us is that we use information about timings to emulate a processor. For the game to work just like the original, instructions must be executed at the correct speed. Some emulators put a lot of effort into this, but when we get there, we’ll have to decide what accuracy we want.

Logical operations

Before we close the topic of binary and hexadecimal numbers, we should talk about logical operations. Probably, you are already accustomed to using logic in code, for example, in such constructions as if ((conditionA) and (conditionB)). In programs that work directly with hardware, you often have to manipulate individual bits of numbers.

AND operation (AND)

Here are all the possible results of the AND (AND) operation (truth table) between two single-bit numbers.

x	y	Result
0	0	0
0	one	0
one	0	0
one	one	one

The result of AND is equal to one only when both values are equal to one. When we combine two numbers with the AND operation, for each bit of one number, AND is executed with the corresponding bit of the other number. The result is stored in this bit of the recipient number. Probably better just look at an example:

	binary								hex
source x	0	one	one	0	one	0	one	one	$ 6B
source y	one	one	0	one	0	0	one	0	$ D2
x AND y	0	one	0	0	0	0	one	0	$ 42

In C, the logical AND operation is a simple ampersand "&".

OR operation (OR)

The OR operation works in a similar way. The only difference is that the result will be equal to one if at least one of the values of x or y is equal to one.

x	y	Result
0	0	0
0	one	one
one	0	one
one	one	one

	binary								hex
source x	0	one	one	0	one	0	one	one	$ 6B
source y	one	one	0	one	0	0	one	0	$ D2
x or y	one	one	one	one	one	0	one	one	$ Fb

In C, the logical OR operation is indicated by a vertical bar "|".

Why is it important?

In many old processors, and especially in arcade machines, the game often requires working with only one bit of the number. Often there is a similar code:

/* Пример 1: считываем с панели управления */char *buttons_ptr = (char *)0x2043;    
   char buttons = *buttons_ptr;    
   if (buttons & 0x4)    
    HandleLeftButton();
   /* Пример 2: включаем LED-индикатор на панели управления */char * LED_pointer = (char *) 0x2089;    
   char led = *LED_pointer;    
   led = led | 0x40; //задаём, что LED управляется битом 6    
   *LED_pointer = led;    
   /* Пример 3: отключаем один LED-индикатор */char * LED_pointer = (char *) 0x2089;    
   char led = *LED_pointer;    
   led = led & 0xBF; //маскируем бит 6    
   *LED_pointer = led;

In example 1, the memory address of $ 2043 is the address of the buttons on the control panel. This code reads and responds to the pressed button. (Of course, in Space Invaders this code will be in assembly language!)

In example 2, the game wants to light the LED indicator, which is located in bit 6 of the $ 2089 address allocated in memory. The code should read the already existing value, change only one bit, and write it back.

In example 3, you need to disable the indicator from example 2, so the code should reset bit 6 of the address $ 2089. This can be done by performing the AND operation with a value for which the indicator control byte has only bit 6. Thus, we will affect only 6, leaving the remaining bits unchanged.

This is usually called a “mask.” In C, the mask is usually written using the NOT operator denoted by a tilde ("~"). Therefore, instead of writing 0xBF, I will simply write down ~0x40and get the same number, but without investing much effort.

Introduction to Assembly Language

If you are reading this tutorial, you are probably familiar with computer programming, for example, in Java or Python. These languages allow you to do a lot of work in just a few lines of code. The code is considered to be cleverly written if it performs as much work as possible in as few lines as possible, perhaps even with the help of built-in library functionality. Such languages are called high-level languages.

In assembly language, on the other hand, there are no built-in life-simplifying capabilities, and simple tasks may require many lines of code. Assembly language is considered a low level language. In it, you have more to get used to thinking in the style of "what specific sequence of steps must be done to accomplish this task?"

The most important thing to know about assembly language is that each line is translated into one processor command.

Consider the following construction from the C language:

int a = b + 100;

In assembly language, this task will have to be performed in the following sequence:

Load the address of variable B into register 1
Load the contents of this memory address into register 2
Add immediate value 0x64 to register 2
Load variable address A into register 1
Write the contents of register 2 to the address stored in register 1

In the code, it will look something like this:

   lea    a1, #$1000        ; адрес переменной a    
   lea    a2, #$1008        ; адрес переменной b    
   move.l d0,(a2)    
   add.l  d0, #$64    
   mov    (a1),d0

It is worth noting the following:

In a high-level language, the compiler decides for itself where to place variables in memory. When writing code in assembler, you yourself are responsible for each memory address that you will use.
In most assembly languages, brackets mean "memory at this address."
In most assembly languages, # denotes an algebraic number, also called an immediate value. For example, in line 1 of the example above, the code actually writes the value # 0x1000 into register a1. If the code looked like move.l a1, ($1000), then a1 would get the contents of the memory at 0x1000.
Each processor has its own assembly language, and it can be difficult to transfer code from one processor to another.
This is not a real processor assembler language, I came up with it for an example.

However, there is one similar trait between high-level smart programmers and assembler masters. Assembly language programmers consider it an honor to complete the task as efficiently as possible and minimize the number of commands used. The code for arcade machines is usually highly optimized and all the juices are squeezed out of each extra byte and cycle.

Stacks

Let's talk a little more about assembly language. In any fairly complex computer program in assembler, subroutines are used. In most CPUs, there is a structure called the stack.

Imagine a stack in the form of a stack. If we need to save a number, we put it on the top of the stack. When we need to return it back, we take it from the top of the pile. Assembly language programmers call writing a number onto the stack “push”, and extracting a number using pop.

Suppose my program needs to call a subroutine. I can write similar code:

   0x1000move.l (sp), d0       ; записываем d0 в стек    
   0x1004add.lsp, #4         ; выполняем инкремент указателя стека    
   0x1008move.l (sp), d1       ; записываем d1 в стек 
   0x1010add.lsp, #4         ; и т.д.    
   0x1014move.l (sp), a0    
   0x1018add.lsp, #4    
   0x101Cmove.l (sp), a1    
   0x1020add.lsp, #4    
   0x1024move.l (sp), #0x1030  ; возвращаем адрес
   0x1028add.lsp, #4    
   0x102Cjmp#0x2040           ; адрес подпрограммы - 0x2040    
   0x1030move.la1, (sp)       ; восстанавливаем значения регистров
   0x1034sub.lsp, #4         ; в обратном порядке
   0x1038move.la0, (sp)       ; восстанавливаем значения регистров
   0x103csub.lsp, #4    
   и т.д.

The code shown above writes d0, d1, a0, and a1 to the stack. Most processors use a stack pointer. This can be a regular register, by convention used as a stack pointer, or a special register with functions for certain instructions.

In 68K series processors, the stack pointer is determined only by agreement, otherwise it is a normal register. In our 8080 processor, the SP register is a special register. He has PUSH and POP commands that write and eject from the stack in just one command.

In our emulator project, we will not write code from scratch. But if you need to analyze programs in assembly language, then it is good to learn how to recognize such constructs.

High level languages

When writing a program in a high-level language, all operations of saving and restoring registers are performed at each function call. We don’t think about them, because the compiler deals with them. Calling functions in a high-level language can take a lot of memory and CPU time.

Have you ever had a program crash when calling a subroutine in an infinite loop? This can occur because each function call pushes the values of the registers onto the stack, and at some point the stack runs out of memory. (If the stack grows too large, this is called a stack overflow, or stack overflow.)

You may have heard about inline functions. They allow you to avoid saving and restoring registers by including the code of the subroutine in the calling function. At the same time, the code becomes larger, but this saves several commands and read / write operations into memory.

Calling conventions

When writing a program in assembly language that calls only your code, you can decide for yourself how the subroutines will communicate with each other. For example, how do I return to the calling function after the subroutine is completed? One way is to write the return address to a specific register. Another is to place the return address at the top of the stack. Very often, the decision depends on what the processor supports. The 8080 has a CALL command that pushes the return address of a function onto the stack. You may use this 8080 command to implement subroutine calls.

It is necessary to take another decision. Is saving registers the responsibility of the calling function or subroutine? In the example shown above, the registers are saved by the calling function. But what if we have 32 registers? Saving and restoring 32 registers when the subroutine uses only a small part of them will be a waste of time.

Compromise can be a mixed approach. Suppose we have chosen a policy in which the subroutine can use the registers r10-r32 without saving their contents, but cannot destroy r1-r9. In this situation, the calling function knows the following:

When returning from a function, the contents of r1-r9 will remain unchanged.
I can not depend on the contents of r10-r32
If I need a value in r10-r32 after calling a subroutine, then before calling it I need to save it somewhere

Similarly, each subroutine knows the following:

I can destroy r10-r32
If I want to use r1-r9, then I need to save the contents and restore it before returning to the function that called me

ABI

On most modern platforms, such policies are created by engineers and published in documents called ABI (Application Binary Interface). Thanks to this document compiler creators know how to compile code that can invoke code compiled by other compilers. If you want to write an assembler code that can function in a similar environment, then you need to know ABI and write code in accordance with it.

ABI knowledge also helps in debugging code when you do not have access to the source code. ABI defines the locations of parameters for functions, so when reviewing any subroutine, you can examine these addresses to understand what is being passed to functions.

We return to the emulator

Most of the hand-written assembly code, especially for older processors and arcade games, does not follow ABI. Programs are coded in assembler and there may not be many subroutines. Each routine saves and restores registers only when absolutely necessary.

If you want to understand what the program is doing, it would be nice to start by tagging the addresses that are targeted for the CALL commands.

Tags: