How the computer inside Hayabusa-2, which dropped a bomb on Ryugu, works. And photos of its developers
In this note I will briefly describe what is inside the HR5000 SoC and its processor core, show photos of two of the key developers of the MIPS 4K and 5K lines, and explain how you can experiment at home on an FPGA with a descendant of this computer's "younger brother": the 32-bit MIPS microAptiv UP core, whose code in the Verilog hardware description language is based on MIPS 4KEc.
The Japanese aerospace agency JAXA licensed the MIPS 5Kf processor core from MIPS Technologies, an American company. This happened back in the 2000s. The group that developed this core has existed in various configurations for 40 years:
- First, in 1978-1984, MIPS was a project at Stanford led by John Hennessy. On the strength of this project's success, Hennessy became the author of the most famous textbook on computer architecture and, at some point, the president of Stanford.
- Then, in 1984, MIPS became a commercial company - MIPS Computer Systems. In the same year, ARM was also commercialized. In 1991, MIPS released the world's first 64-bit microprocessor - MIPS R4000.
- After that, MIPS was acquired by Silicon Graphics, and in the 1990s its processors were used in the graphics workstations on which Hollywood made the first films with realistic computer graphics (Jurassic Park).
- In the 2000s, the group was spun off as MIPS Technologies and, among other things, designed a processor for JAXA. MIPS was headquartered in California; some of the MIPS 5Kf developers worked at MIPS Europe in Copenhagen.
- In 2012, MIPS Technologies was bought by the British company Imagination Technologies, which became famous as the developer of the GPU in early Apple iPhones.
- In 2017, Apple dropped Imagination, and after some turbulence the technology and part of the MIPS group moved to Wave Computing, a startup developing a chip to accelerate neural networks.
- The Wave Computing chip combines a cluster of 64-bit MIPS I6500 processors, a matrix multiplier based on a systolic array in the style of the Google TPU, and a dataflow processor built on a Coarse-Grained Reconfigurable Architecture (CGRA). The classic processors in the I6500 cluster feed data to the matrix multiplier and the dataflow processor, and the matrix multiplier provides computational density. In terms of the tasks it handles, the dataflow processor sits between a classic processor and the matrix multiplier: it is more flexible than the multiplier and faster than a classic CPU.
Here I am in a photo with one of the two key developers of the MIPS 4K and 5K lines, Larry Hudepohl (on the right, in a red shirt). Larry began his career at Digital Equipment Corporation (DEC) as a processor designer for MicroVAX. He then worked at Cyrix, a small company that challenged Intel in the late 1980s with an FPU coprocessor that was compatible with the Intel 80387 and 50% faster. After that, Larry designed MIPS chips at Silicon Graphics. When MIPS Technologies separated from Silicon Graphics, Larry and Ryan Kinter together launched the first independent MIPS product, MIPS 4K, which became the basis of the line that dominated consumer electronics in the 2000s (DVD players, cameras, digital TVs). Then MIPS 5K flew into space: it was used by the Japanese space agency JAXA.
Now back to the processor in Hayabusa-2 (Hayabusa-1 uses a different one). Here is a datasheet for the MIPS64 5Kf processor core and a page with system data on the HR5000 chip. Note some interesting points.
First of all, MIPS 5Kf is a pipelined processor. If you are unfamiliar with how this works, the easiest way to get acquainted is to study the seventh chapter of the book "Digital Design and Computer Architecture" by David M. Harris and Sarah L. Harris, the latest Russian version of which can be downloaded here or here. The pipeline in MIPS 5Kf differs from the classic MIPS pipeline in Harris & Harris. Those of you who have read H&H can look at the differences and guess why:
Of course, MIPS 5Kf has six pipeline stages rather than five, with an additional Dispatch stage. This stage is needed to make MIPS 5Kf a limited superscalar processor. It can not only execute operations one after another in the pipeline, but also execute a floating-point operation at the same time as an integer operation or a memory operation (load or store). The Dispatch stage issues instructions to a floating-point coprocessor that has its own seven-stage pipeline:
And here, on the right in the photo, is Darren Jones, developer of the FPU in MIPS 5Kf. The letter "f" in "5Kf" means precisely that the core has floating point:
In this table you can see how many cycles different FPU operations take (latency) and how often they can be started in the pipeline (repeat rate). For example, a single-precision multiplication takes four cycles, but a new multiplication can be started every cycle, so the FPU can have four single-precision multiplications in flight at once, each at a different stage. A double-precision multiplication takes five cycles, and a new one can be started only every other cycle. The complex operation of taking a double-precision square root takes as many as 32 cycles, and a new square root can be started only after 29 cycles. These are the numbers to keep in mind when optimizing calculations of the spacecraft's coordinates and equations of motion in outer space.
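To get a feel for what latency and repeat rate mean in practice, here is a minimal sketch in Python. It uses only the three figures quoted above; the class and function names are my own, and the model assumes a stream of fully independent operations of a single type with no other stalls, which real code rarely achieves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FpuOp:
    name: str
    latency: int      # cycles from issue until the result is available
    repeat_rate: int  # minimum cycles between two issues of this operation


# Figures quoted in the text; the mnemonics are used only as labels.
MUL_S = FpuOp("mul.s", latency=4, repeat_rate=1)
MUL_D = FpuOp("mul.d", latency=5, repeat_rate=2)
SQRT_D = FpuOp("sqrt.d", latency=32, repeat_rate=29)


def cycles_for_independent_ops(op: FpuOp, count: int) -> int:
    """Cycles to finish `count` independent operations issued back to back."""
    if count <= 0:
        return 0
    # The last operation issues after (count - 1) repeat intervals,
    # then needs the full latency to complete.
    return (count - 1) * op.repeat_rate + op.latency


def max_in_flight(op: FpuOp) -> int:
    """Largest number of operations of this type in the pipeline at once."""
    return -(-op.latency // op.repeat_rate)  # ceiling division


if __name__ == "__main__":
    for op in (MUL_S, MUL_D, SQRT_D):
        print(op.name, cycles_for_independent_ops(op, 100), max_in_flight(op))
```

Under these assumptions, 100 single-precision multiplications finish in 103 cycles, while 100 double-precision square roots need 2903, which is why the repeat rate matters more than the latency for long independent streams.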
Hayabusa-2 uses the MIPS 5Kf configuration with separate 32-kilobyte instruction and data caches. However, it is not clear from the brief description of the HR5000 whether each cache is four-way set-associative with 8 kilobytes per way or two-way set-associative with 16 kilobytes per way. You can read how these caches work in H&H, in my old presentation about caches, and in the useful book See MIPS Run Linux, 2nd Edition, by Dominic Sweetman:
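The difference between the two possible organizations is easy to see in how an address is split into tag, index and offset. The sketch below is illustrative only: the 32-byte line size is my assumption (the text above does not state it), and the function names are hypothetical.

```python
def cache_geometry(total_bytes: int, ways: int, line_bytes: int) -> dict:
    """Derive sets and bit-field widths for a set-associative cache."""
    way_bytes = total_bytes // ways
    sets = way_bytes // line_bytes
    return {
        "way_bytes": way_bytes,
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2 for powers of two
        "index_bits": sets.bit_length() - 1,
    }


def split_address(addr: int, geom: dict) -> tuple:
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = addr & ((1 << geom["offset_bits"]) - 1)
    index = (addr >> geom["offset_bits"]) & (geom["sets"] - 1)
    tag = addr >> (geom["offset_bits"] + geom["index_bits"])
    return tag, index, offset


LINE_BYTES = 32  # assumption, not taken from the HR5000 description

four_way = cache_geometry(32 * 1024, ways=4, line_bytes=LINE_BYTES)
two_way = cache_geometry(32 * 1024, ways=2, line_bytes=LINE_BYTES)

if __name__ == "__main__":
    print("4-way:", four_way, split_address(0x12345678, four_way))
    print("2-way:", two_way, split_address(0x12345678, two_way))
```

With these assumptions the four-way variant has 256 sets and the two-way variant has 512, so the two-way cache uses one more index bit and one fewer tag bit for the same total capacity.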
Hayabusa-2 also has a memory management unit (MMU) with a translation lookaside buffer (TLB). The TLB is a universal tool for quickly translating virtual addresses into physical ones. The TLB makes it possible to:
- Hide operating system memory from unprivileged code.
- Protect user programs from each other.
- Give a program access to more virtual memory than there is physical RAM.
- Address more physical memory than the virtual address space covers.
- Place a program in any part of physical memory.
- Make several separate memory regions look like one contiguous piece.
- Load pieces of a program from an external device as needed.
The TLB also associates various attributes with an address: read, write, and execute permissions, as well as cache and coherence attributes.
The cache attribute tells the processor which parts of the address space are backed by the next level of the memory hierarchy and which belong to I/O registers that must not be cached.
Coherence attributes are needed when several processor cores work together, each with its own first-level cache, while sharing a common second-level cache.
The TLB can also store a flag showing that the page at a given address has been written to. This helps with swapping, that is, loading and unloading memory pages on systems that have less physical memory than an application needs in order to address all of its code and data through virtual addresses.
This is how a 64-bit virtual address is translated into a 36-bit physical address on MIPS 5Kf. Why does Hayabusa-2 need a 64-bit processor with 36-bit physical addresses? I suspect that Hayabusa-2 takes photographs and has to process images, which requires a lot of memory. Perhaps for some algorithms 64-bit arithmetic and 64-bit cache transfers (or 64-bit uncached memory transfers) improve something, and that turns out to be useful in space. But I don't know for sure; I should probably ask Zelenyikot and amartology, who know more about space than I do.
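The core idea of the translation can be sketched in a few lines of Python. This is a deliberately simplified model, not the real MIPS64 scheme: it assumes a fixed 4 KB page size (MIPS TLBs support several page sizes), ignores address-space regions, ASIDs and permission bits, and uses a plain dictionary (`page_map`, a hypothetical name) in place of the TLB.

```python
PAGE_BITS = 12   # assumption: fixed 4 KB pages
PHYS_BITS = 36   # physical address width mentioned in the text


def translate(vaddr: int, page_map: dict) -> int:
    """Translate a virtual address using a {virtual page: physical frame} map."""
    vpn = vaddr >> PAGE_BITS                  # virtual page number
    offset = vaddr & ((1 << PAGE_BITS) - 1)   # byte offset, passed through as is
    if vpn not in page_map:
        raise KeyError(f"no translation for virtual page {vpn:#x}")
    paddr = (page_map[vpn] << PAGE_BITS) | offset
    if paddr >> PHYS_BITS:
        raise ValueError("physical address does not fit in 36 bits")
    return paddr


if __name__ == "__main__":
    # One hypothetical mapping: a high virtual page onto physical frame 0x12345.
    page_map = {0xFFFF_FFFF_8000_0: 0x12345}
    print(hex(translate(0xFFFF_FFFF_8000_0ABC, page_map)))
```

Only the page number is looked up; the low 12 bits of the address are copied unchanged, which is why a 64-bit virtual address can land anywhere in a much smaller 36-bit physical space.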
You can read about the TLB in H&H and See MIPS Run, but there is a nuance: both books describe what the TLB looks like from a programmer's point of view. From the hardware developer's point of view, the processor designers are deceiving the programmer by presenting the TLB as a single associative translation table, when in fact there are three tables inside it: an instruction micro-TLB (ITLB), a data micro-TLB (DTLB), and a joint TLB (JTLB). The MMU first searches the ITLB or DTLB, and only if it does not find the translation there does it take it from the JTLB. This costs the processor an extra 2 cycles. Also see my old presentation on TLB:
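The lookup order described above can be modeled roughly as follows. This is a behavioral sketch, not the 5Kf implementation: the micro-TLB capacity of 4 entries and the oldest-first replacement policy are my assumptions, and all class names are hypothetical. Only the 2-cycle JTLB penalty comes from the text.

```python
from collections import OrderedDict

JTLB_PENALTY_CYCLES = 2  # extra cycles when the micro-TLB misses (from the text)


class TlbMiss(Exception):
    """Raised when the page is not in the JTLB either (software must refill)."""


class MicroTlb:
    def __init__(self, capacity: int = 4):  # capacity is an assumption
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, vpn: int):
        return self.entries.get(vpn)

    def refill(self, vpn: int, pfn: int) -> None:
        self.entries[vpn] = pfn
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry


class Mmu:
    def __init__(self, jtlb: dict):
        self.jtlb = dict(jtlb)
        self.itlb = MicroTlb()
        self.dtlb = MicroTlb()

    def translate(self, vpn: int, is_fetch: bool):
        """Return (physical frame number, extra cycles spent on the lookup)."""
        micro = self.itlb if is_fetch else self.dtlb
        pfn = micro.lookup(vpn)
        if pfn is not None:
            return pfn, 0
        if vpn not in self.jtlb:
            raise TlbMiss(hex(vpn))
        pfn = self.jtlb[vpn]
        micro.refill(vpn, pfn)  # keep a copy close to the pipeline
        return pfn, JTLB_PENALTY_CYCLES
```

In this model the first instruction fetch from a page pays the 2-cycle penalty and later fetches from the same page do not, while the first data access to that page pays it again because the DTLB is filled separately.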
The interface between the first-level caches and the memory controller of the MIPS 5Kf in Hayabusa-2 is called EB (pronounced "ee-bee"), short for External Bus. It is similar to AHB and AXI and supports burst transfers: a whole cache line can be written back from the cache or filled into the cache from memory using transfers in consecutive cycles.
Outside the processor core, the HR5000 has an interrupt controller, a UART module, a direct memory access controller, timers and a PCI controller:
To operate in space, the chip must be protected from radiation. I am not a specialist in radiation hardening (for that there is amartology on Habr), but I know that such protection can be implemented at the level of the physical manufacturing technology, at the level of various ECC checks, and even at the level of architecture, with triplication and so on. The creators of the HR5000 system on chip decided to use the usual RTL-to-GDSII flow adopted in commercial applications: synthesizing a graph of logic elements from code in the Verilog hardware description language. However, after obtaining such a graph (netlist), they modify it using a special library of hardness-by-design (HBD) primitives (I have never used this, so any clarification in the comments is welcome):
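As an illustration of the architectural approach mentioned above (triplication), here is a minimal Python sketch of a bitwise majority voter over three copies of a register. It shows the general technique only; I am not claiming that the HR5000 or its HBD library works this way, and the function name is my own.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three copies: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    stored = 0b1011_0110
    # Simulate a single-event upset that flips one bit in one of the copies.
    upset_copy = stored ^ 0b0001_0000
    recovered = majority_vote(stored, upset_copy, stored)
    print(bin(recovered), recovered == stored)
```

A single flipped bit in any one copy is outvoted by the other two. Flips at the same bit position in two copies would defeat the voter, which is one reason such schemes are usually combined with the other levels of protection listed above.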
Since MIPS 5Kf is written in Verilog, it can be turned not only into a netlist, and not only into masks for manufacturing a chip at a fab, but also into an FPGA configuration. Unfortunately, the sources of MIPS 5Kf are not publicly available, but the source code of a descendant of its "younger brother", the 32-bit MIPS 4K processor, is. This descendant is called MIPS microAptiv UP, and its basic configuration is included in the MIPSfpga package. The MIPS 4K / 4KEc / microAptiv UP / M5150 code (these are all successive versions of the line) was also written by Larry, Ryan and Darren.
You can experiment with the pipeline, caches, MMU and interrupts of the MIPS microAptiv UP core, running it in a simulator or on an FPGA board. To do this, just download the MIPS Open™ FPGA Getting Started Package along with MIPS Open™ FPGA Labs and (this is important!) complement them with MIPSfpga+. The latter contains labs on the pipeline, the cache and the MMU.
You can synthesize and run the MIPS microAptiv UP processor on an inexpensive $85 board (the academic price is $55):
To work with the MIPSfpga / MIPSfpga+ package, you need knowledge of the Verilog hardware description language, an understanding of design at the register transfer level, and the ability to write MIPS assembly.
MIPS assembly is the easiest of these to learn. To do so, you can download the MARS simulator (MIPS Assembler and Runtime Simulator). You can learn how to use it in 5 minutes; in fact, it has just three buttons: assemble, run, and run step by step:
Then you can spend a day practicing assembly programming with the books Harris & Harris and See MIPS Run Linux.
If you know nothing at all about digital circuit design in general and hardware description languages in particular, you can start with the Rusnano online course for schoolchildren, in three parts: "From the transistor to the microcircuit", "The logical side of digital circuitry", and "The physical side of digital circuitry". Then you can study Verilog with H&H and learn what a processor is using the simplified schoolMIPS processor.
If you are interested in this topic and want to take part in the work on MIPS Open (the program under which the MIPS microAptiv UP core was opened), write in the comments. The Rusnano team is also holding a seminar on digital design for schoolchildren on April 17-19, which will cover, among other things, this space processor. Hayabusa-2 did not bomb Ryugu in vain: it is also an occasion for Russian schoolchildren and students to find out what is inside it.
Is it worth opening the sources of the MIPS 5Kf processor core that is inside Hayabusa-2?
- 65.4% Yes (72 votes)
- 34.5% No, aliens will hack it through EJTAG and send viruses to Earth (38 votes)