NWOcs March 18, 2013 at 18:10

How to run a program without an operating system

It turned out that our article describing the mechanism for polling the PCI bus did not describe the most important thing in detail: how do you run this code on real hardware? How do you create your own boot disk? In this article we answer these questions in detail (they were partially covered in the previous article, but for the reader's convenience we allow ourselves a little duplication).

There are many descriptions and tutorials on the Internet about writing your own mini-OS, and there are even hundreds of ready-made hobby OSs. One of the most worthwhile resources on the subject is the osdev.org portal. To complement the previous article about PCI (and to make it possible to write follow-up articles about various functions present in any modern OS), we give step-by-step instructions for creating a boot disk with an ordinary C program. We tried to write in as much detail as possible so that you can work everything out on your own.

So, the goal: having spent as little effort as possible, create your own bootable USB flash drive, which simply prints the classic “Hello World” on the computer screen.

To be more precise, we need to get into protected mode with paging and interrupts disabled: the simplest processor mode, in which the code behaves much like an ordinary console program. The most reasonable way to achieve this is to build a kernel that supports the multiboot format and load it with the popular Grub bootloader. The alternative is to write your own volume boot record (VBR) that loads your own bootloader. A decent bootloader must at least be able to work with the disk and the file system and to parse ELF images. That means writing a lot of assembler code and a lot of C code. In short, it is easier to use Grub, which already does everything necessary.

To begin with, the steps below require a certain set of compilers and utilities. The easiest way is to use some Linux distribution (for example, Ubuntu), since it already contains everything you need to create a bootable flash drive. If you are used to working in Windows, you can set up a virtual machine with Linux (using VirtualBox or VMware Workstation).

If you are using Linux Ubuntu, then first of all you need to install several utilities:
1. Grub. To do this, use the command:

sudo apt-get install grub

2. Qemu. It is needed to test and debug everything quickly. The command is similar:

sudo apt-get install qemu

Now our plan looks like this:
1. Create a C program that prints a line on the screen.
2. Build from it an image (kernel.bin) in multiboot format so that GRUB can load it.
3. Create a boot disk image file and format it.
4. Install Grub on this image.
5. Copy the created program (kernel.bin) to the disk.
6. Write the image to physical media or run it in qemu.

[Figure: the system boot process]

To make it work, you will need to create several files and directories:

kernel.c     Program code written in C. The program prints a message on the screen.
makefile     Makefile, a script that runs the entire build of the program and creates a boot image.
linker.ld    Linker script for the kernel.
loader.s     Assembler code that is called by Grub and transfers control to the main function of the C program.
include/     Folder with header files.
grub/        Folder with the Grub files.
common/      General-purpose folder. It includes the implementation of printf.

Step 1. Creating the code of the target program (kernel):

Create a kernel.c file that will contain the following code that prints a message on the screen:

#include "printf.h"
#include "screen.h"
#include "types.h"
void main(void)
{   
    clear_screen();
    printf("\n>>> Hello World!\n");
}

Everything is familiar and simple here. Adding the printf and clear_screen functions will be discussed later. In the meantime, you need to supplement this code with everything necessary so that it can be loaded by Grub.
In order for the kernel to be in multiboot format, you need the following structure in the first 8 kilobytes of the kernel image:

0x1BADB002 = MAGIC	Multiboot signature.
0x0 = FLAGS	Flags that describe additional requirements for loading the kernel and the parameters the loader passes to the kernel (our program). In this case all flags are cleared.
0xE4524FFE = -(MAGIC + FLAGS)	Checksum.

If all these conditions are met, Grub passes the value 0x2BADB002 in the %eax register and a pointer to the Multiboot information structure in the %ebx register. The Multiboot information structure contains various data, including a list of loaded modules and their locations, which may be needed for further loading of the system.
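The checksum rule from the table can be checked on an ordinary host machine. The following is a minimal sketch, not part of the kernel; the function names multiboot_checksum and multiboot_header_ok are made up for illustration. It relies only on the rule that the three 32-bit header fields must add up to zero modulo 2^32.

```c
#include <stdint.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u

/* Checksum field: chosen so that magic + flags + checksum wraps to 0. */
uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
    return (uint32_t)(0u - (magic + flags));
}

/* Returns 1 when the three header words satisfy the Multiboot rule. */
int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MULTIBOOT_HEADER_MAGIC &&
           (uint32_t)(magic + flags + checksum) == 0u;
}
```

With MAGIC = 0x1BADB002 and FLAGS = 0 the first function yields 0xE4524FFE, the value given in the table.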
In order for the program file to contain the necessary signatures, create a loader.s file with the following contents:


    .text
    .global loader                   # making entry point visible to linker
    # setting up the Multiboot header - see GRUB docs for details
    .set FLAGS,    0x0               # this is the Multiboot 'flag' field
    .set MAGIC,    0x1BADB002        # 'magic number' lets bootloader find the header
    .set CHECKSUM, -(MAGIC + FLAGS)  # checksum required
    .align 4
    .long MAGIC
    .long FLAGS
    .long CHECKSUM
# reserve initial kernel stack space
    .set STACKSIZE, 0x4000           # that is, 16k.
    .lcomm stack, STACKSIZE          # reserve 16k stack
    .comm  mbd, 4                    # available to main as an extern symbol
    .comm  magic, 4                  # available to main as an extern symbol
loader:
    movl  $(stack + STACKSIZE), %esp # set up the stack
    movl  %eax, magic                # Multiboot magic number
    movl  %ebx, mbd                  # Multiboot data structure
    call  main                      # call C code
    cli
hang:
    hlt                              # halt machine should kernel return
    jmp   hang

This code is taken, almost unchanged, from wiki.osdev.org/Bare_Bones. Since gcc is used for compilation, the code uses GAS syntax. Let's take a closer look at what it does.

.text

All subsequent code will fall into the executable .text section.

.global loader

We declare the loader symbol visible to the linker. This is required because the linker will use loader as the entry point.

    .set FLAGS,    0x0               # assign FLAGS = 0x0
    .set MAGIC,    0x1BADB002        # assign MAGIC = 0x1BADB002
    .set CHECKSUM, -(MAGIC + FLAGS)  # assign CHECKSUM = -(MAGIC + FLAGS)
    .align 4                         # align the following data on a 4-byte boundary
    .long MAGIC                      # place the value of MAGIC at the current address
    .long FLAGS                      # place the value of FLAGS at the current address
    .long CHECKSUM                   # place the value of CHECKSUM at the current address

This code forms the Multiboot signature. The .set directive sets the value of a symbol to the expression to the right of the comma. The .align 4 directive aligns the following contents on a 4-byte boundary. The .long directive stores a value in the next four bytes.

    .set STACKSIZE, 0x4000   # assign STACKSIZE = 0x4000
    .lcomm stack, STACKSIZE  # reserve STACKSIZE bytes; stack refers to this range
    .comm  mbd, 4            # reserve 4 bytes for the variable mbd in the COMMON area
    .comm  magic, 4          # reserve 4 bytes for the variable magic in the COMMON area

Grub does not set up a stack during boot, so the first thing the kernel must do is set one up; for this we reserve 0x4000 bytes (16 KB). The .lcomm directive reserves in the .bss section the number of bytes specified after the comma. The name stack will be visible only within the compiled file. The .comm directive does the same as .lcomm, but the symbol name is declared globally. This means that we can use it from C code by writing the following line:
extern int magic;
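As an illustration of how the saved value might be used from C, here is a minimal sketch. The Multiboot specification defines 0x2BADB002 as the value a compliant loader leaves in %eax; the helper name loaded_by_multiboot is hypothetical and is not part of the original sources.

```c
#include <stdint.h>

/* Value a Multiboot-compliant loader leaves in %eax (Multiboot 1). */
#define MULTIBOOT_BOOTLOADER_MAGIC 0x2BADB002u

/* Hypothetical helper: returns 1 if the saved %eax value looks right. */
int loaded_by_multiboot(uint32_t saved_eax)
{
    return saved_eax == MULTIBOOT_BOOTLOADER_MAGIC;
}

/* In kernel.c this could be used roughly as follows (an assumption,
 * the original main does not perform this check):
 *
 *     extern int magic;
 *     if (!loaded_by_multiboot((uint32_t)magic)) { ... report and stop ... }
 */
```

The value is passed as a parameter so that the check can also be compiled and tried on a host machine, where the magic symbol from loader.s does not exist.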

And now the last part:

loader:
    movl  $(stack + STACKSIZE), %esp # initialize the stack
    movl  %eax, magic                # store %eax at the address magic
    movl  %ebx, mbd                  # store %ebx at the address mbd
    call  main                       # call the main function
    cli                              # disable hardware interrupts
hang:
    hlt                              # halt the processor until an interrupt occurs
    jmp   hang                       # jump to the label hang

The first instruction loads the address of the top of the stack into the %esp register. Because the stack grows downward, %esp receives the address of the end of the range reserved for the stack. The next two instructions store the values that Grub passes in the %eax and %ebx registers into the previously reserved 4-byte ranges. Then the main function, which is written in C, is called. If main returns, the processor disables interrupts and halts in an endless loop.
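Why %esp is set to the end of the reserved range can be seen in a toy model of a downward-growing stack. This is only a host-side sketch; struct toy_stack and its functions are invented for illustration and do not appear in the kernel.

```c
#include <stdint.h>

#define STACKSIZE 0x4000u            /* same size as in loader.s */

/* Toy model: the "stack pointer" is an offset into a reserved region. */
struct toy_stack {
    uint8_t  mem[STACKSIZE];
    uint32_t sp;                     /* offset of the current top */
};

/* Mirrors movl $(stack + STACKSIZE), %esp: start at the end of the region. */
void toy_stack_init(struct toy_stack *s)
{
    s->sp = STACKSIZE;
}

/* A 32-bit push first moves the pointer down by 4, then stores the value. */
void toy_push(struct toy_stack *s, uint32_t value)
{
    s->sp -= 4;
    s->mem[s->sp + 0] = (uint8_t)(value);
    s->mem[s->sp + 1] = (uint8_t)(value >> 8);
    s->mem[s->sp + 2] = (uint8_t)(value >> 16);
    s->mem[s->sp + 3] = (uint8_t)(value >> 24);
}
```

If the pointer started at the beginning of the region instead, the very first push would write below the reserved memory.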

Step 2. Preparation of additional code for the program (system library):

Since the entire program is written from scratch, the printf function must be written from scratch. To do this, prepare several files.
Create a common and include folder:

mkdir common
mkdir include

Create the file common/printf.c, which will contain the implementation of the familiar printf function. The entire file can be taken from the www.bitvisor.org project. Its path in the bitvisor sources is core/printf.c. To use the copied printf.c in the target program, replace the lines:

#include "initfunc.h"
#include "printf.h"
#include "putchar.h"
#include "spinlock.h"

with the lines:

#include "types.h"
#include "stdarg.h"
#include "screen.h"

Then, remove the printf_init_global function and all its references in this file:

static void
printf_init_global (void)
{
    spinlock_init (&printf_lock);
}
INITFUNC ("global0", printf_init_global);

Then delete the printf_lock variable and all its references in this file:


static spinlock_t printf_lock;
…
spinlock_lock (&printf_lock);
…
spinlock_unlock (&printf_lock);

The printf function uses the putchar function, which also needs to be written. To do this, create the file common/screen.c with the following contents:


#include "types.h"
#define GREEN    0x2
#define MAX_COL  80		// Maximum number of columns 
#define MAX_ROW  25		// Maximum number of rows 
#define VRAM_SIZE (MAX_COL*MAX_ROW)	// Size of screen, in short's 
#define DEF_VRAM_BASE 0xb8000	// Default base for video memory
static unsigned char curr_col = 0;
static unsigned char curr_row = 0;
// Write character at current screen location
#define PUT(c) ( ((unsigned short *) (DEF_VRAM_BASE)) \
	[(curr_row * MAX_COL) + curr_col] = (GREEN << 8) | (c))
// Place a character on next screen position
static void cons_putc(int c)
{
    switch (c) 
    {
    case '\t':
        do 
        {
            cons_putc(' ');
        } while ((curr_col % 8) != 0);
        break;
    case '\r':
        curr_col = 0;
        break;
    case '\n':
        curr_row += 1;
        if (curr_row >= MAX_ROW) 
        {
            curr_row = 0;
        }
        break;
    case '\b':
        if (curr_col > 0) 
        {
            curr_col -= 1;
            PUT(' ');
        }
        break;
    default:
        PUT(c);
        curr_col += 1;
        if (curr_col >= MAX_COL) 
        {
            curr_col = 0;
            curr_row += 1;
            if (curr_row >= MAX_ROW) 
            {
                curr_row = 0;
            }
        }
    };
}
void putchar( int c )
{
    if (c == '\n') 
        cons_putc('\r');
    cons_putc(c);
}
void clear_screen( void )
{
    curr_col = 0;
    curr_row = 0;
    int i;
    for (i = 0; i < VRAM_SIZE; i++)
        cons_putc(' ');
    curr_col = 0;
    curr_row = 0;
}

The code above contains simple logic for printing characters to the screen in text mode. In this mode each character occupies two bytes (one holds the character code, the other its attributes), written directly to video memory, which starts at address 0xB8000 and is displayed on the screen immediately. The screen is 80x25 characters. The PUT macro writes the character itself.
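The layout of a character cell and its position in video memory can be written out as two small functions. This is a sketch for illustration only; vga_cell and vga_cell_address are hypothetical names, and the arithmetic simply restates what the PUT macro does with a 16-bit array index.

```c
#include <stdint.h>

#define VGA_TEXT_BASE 0xB8000u   /* start of text-mode video memory */
#define VGA_COLS      80u        /* characters per row */

/* Low byte: character code; high byte: attribute (colour). */
uint16_t vga_cell(uint8_t ch, uint8_t attr)
{
    return (uint16_t)(((uint16_t)attr << 8) | ch);
}

/* Physical address of the cell at (row, col); each cell is 2 bytes. */
uint32_t vga_cell_address(uint32_t row, uint32_t col)
{
    return VGA_TEXT_BASE + (row * VGA_COLS + col) * 2u;
}
```

For example, a green 'H' (attribute 0x2) is stored as the 16-bit value 0x0248, and the first cell of the second row lies 160 bytes after the start of video memory.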
Now only a few header files are missing:
1. The file include/screen.h. Declares the putchar function, which is used by printf, and the clear_screen function. File contents:


#ifndef _SCREEN_H
#define _SCREEN_H
void clear_screen( void );
void putchar( int c );
#endif

2. The file include/printf.h. Declares the printf function, which is used in main. File contents:

#ifndef _PRINTF_H
#define _PRINTF_H
int printf (const char *format, ...); 
#endif

3. The file include/stdarg.h. Declares the macros for walking through arguments whose number is not known in advance. The entire file is taken from the www.bitvisor.org project. Its path in the bitvisor sources is include/core/stdarg.h.
4. The file include/types.h. Declares NULL and size_t. File contents:

#ifndef _TYPES_H
#define _TYPES_H
#define NULL 0
typedef unsigned int size_t;
#endif

Thus, the include and common folders contain the minimum system library code that any program needs.
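To show what the stdarg.h macros are for, here is a minimal sketch of a variadic function. It is compiled against the host's standard <stdarg.h> rather than the copy taken from bitvisor (an assumption made so the example can be tried outside the kernel), and sum_ints is a made-up name.

```c
#include <stdarg.h>

/* Sums `count` int arguments; the caller must pass exactly that many. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);

    return total;
}
```

printf works the same way, except that it learns the number and types of the arguments from the format string instead of from a count.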

Step 3. Creating a script for the linker:

We create the linker.ld file, which will be used by the linker to generate the target program file (kernel.bin). The file should contain the following:

ENTRY (loader)
LMA = 0x00100000;
SECTIONS
{
    . = LMA;
    .multiboot ALIGN (0x1000) :   {  loader.o( .text ) }
    .text      ALIGN (0x1000) :   {  *(.text)          }
    .rodata    ALIGN (0x1000) :   {  *(.rodata*)       }
    .data      ALIGN (0x1000) :   {  *(.data)          }
    .bss :                        {  *(COMMON) *(.bss) }
    /DISCARD/ :                   {  *(.comment)       }
}

The ENTRY() command sets the entry point of our kernel. Grub transfers control to this address after the kernel has been loaded. Using this script, the linker creates a binary file in ELF format. An ELF file consists of a set of segments and sections. The list of segments is stored in the program header table, the list of sections in the section header table. The linker works with sections, while the image loader (in our case, GRUB) works with segments.

As the figure shows, segments consist of sections. One of the fields describing a section is the virtual address at which the section must reside at run time. A segment has two fields that describe its location: its virtual address and its physical address. The virtual address is the address of the segment's first byte while the code is executing; the physical address is the address at which the segment must be loaded. For ordinary applications these addresses always match. Grub loads image segments at their physical addresses. Since neither Grub nor our program sets up paging, each segment's virtual address must equal its physical address.

SECTIONS

Indicates that the section descriptions follow.

. = LMA;

This expression sets the location counter to LMA, telling the linker to place all subsequent sections starting at that address.

 ALIGN (0x1000)

The directive above means that the section is aligned to 0x1000 bytes.
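The effect of aligning to 0x1000 bytes is the usual round-up-to-a-multiple operation. A minimal sketch follows; align_up is a hypothetical helper name, and the formula assumes the alignment is a power of two.

```c
#include <stdint.h>

/* Rounds addr up to the next multiple of align (align must be a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1u) & ~(align - 1u);
}
```

An address that is already aligned, such as 0x00100000, stays where it is, while 0x00100123 would be moved up to 0x00101000.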

.multiboot ALIGN (0x1000) : {  loader.o( .text )  }

A separate .multiboot section, which contains the .text section from the loader.o file, ensures that the multiboot signature ends up in the first 8 KB of the kernel image.
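The reason the signature must sit near the start of the file is that a loader only searches the first 8 KB for it, at 4-byte-aligned offsets. The sketch below shows how such a search could look; it is not Grub's actual code, and find_multiboot_header and read_le32 are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u
#define MULTIBOOT_SEARCH_LIMIT 8192u    /* header must lie in the first 8 KB */

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the byte offset of the first valid header, or -1 if none is found. */
long find_multiboot_header(const uint8_t *image, size_t len)
{
    size_t limit = len < MULTIBOOT_SEARCH_LIMIT ? len : MULTIBOOT_SEARCH_LIMIT;

    for (size_t off = 0; off + 12 <= limit; off += 4) {   /* 4-byte aligned */
        uint32_t magic    = read_le32(image + off);
        uint32_t flags    = read_le32(image + off + 4);
        uint32_t checksum = read_le32(image + off + 8);

        if (magic == MULTIBOOT_HEADER_MAGIC &&
            (uint32_t)(magic + flags + checksum) == 0u)
            return (long)off;
    }
    return -1;
}
```

Reading kernel.bin into a buffer and calling this function is one way to check that the linker script placed the header where a loader can find it.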

.bss :  { *(COMMON) *(.bss) }

*(COMMON) is the area in which memory is reserved by the .comm directives (memory reserved with .lcomm goes directly into .bss). We place it in the .bss section.

/DISCARD/ : { *(.comment) }

All sections listed under /DISCARD/ are removed from the image. In this case we delete the .comment section, which contains information about the version of the compiler.

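Now compile the code into a binary file with the following commands (on a 64-bit host you will most likely have to request 32-bit output explicitly: --32 for as, -m32 for gcc and -m elf_i386 for ld):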

as -o loader.o loader.s
gcc -Iinclude -Wall -fno-builtin -nostdinc -nostdlib -o kernel.o -c kernel.c
gcc -Iinclude -Wall -fno-builtin -nostdinc -nostdlib -o printf.o -c common/printf.c
gcc -Iinclude -Wall -fno-builtin -nostdinc -nostdlib -o screen.o -c common/screen.c
ld -T linker.ld -o kernel.bin kernel.o screen.o printf.o loader.o

Using objdump, let's look at the kernel image after linking:


objdump -ph ./kernel.bin

As you can see, the sections in the image match those described in the linker script. The linker formed 3 segments from the described sections. The first segment includes the sections .multiboot, .text and .rodata and has the virtual and physical address 0x00100000. The second segment contains the .data and .bss sections and is located at 0x00104000. Everything is now ready for Grub to load this file.

Step 4. Preparing the Grub bootloader:

Create the grub folder:

mkdir grub

Copy into this folder the Grub files needed to install it on the image (these files exist if Grub is installed on the system). To do this, run the following commands:


cp /usr/lib/grub/i386-pc/stage1 ./grub/
cp /usr/lib/grub/i386-pc/stage2 ./grub/
cp /usr/lib/grub/i386-pc/fat_stage1_5 ./grub/

Create the file grub/menu.lst with the following contents:

timeout   3
default   0
title  mini_os
root   (hd0,0)
kernel /kernel.bin

Step 5. Automate and create a boot image:

To automate the build process, we will use the make utility. To do this, create a makefile that compiles the source code, links the kernel and creates a boot image. The makefile should have the following contents:

CC      = gcc
CFLAGS  = -Wall -fno-builtin -nostdinc -nostdlib
LD      = ld
OBJFILES = \
	loader.o  \
	common/printf.o  \
	common/screen.o  \
	kernel.o
image:
	@echo "Creating hdd.img..."
	@dd if=/dev/zero of=./hdd.img bs=512 count=16065 1>/dev/null 2>&1
	@echo "Creating bootable first FAT32 partition..."
	@losetup /dev/loop1 ./hdd.img
	@(echo c; echo u; echo n; echo p; echo 1; echo ;  echo ; echo a; echo 1; echo t; echo c; echo w;) | fdisk /dev/loop1 1>/dev/null 2>&1 || true
	@echo "Mounting partition to /dev/loop2..."
	@losetup /dev/loop2 ./hdd.img \
    --offset    `echo \`fdisk -lu /dev/loop1 | sed -n 10p | awk '{print $$3}'\`*512 | bc` \
    --sizelimit `echo \`fdisk -lu /dev/loop1 | sed -n 10p | awk '{print $$4}'\`*512 | bc`
	@losetup -d /dev/loop1
	@echo "Format partition..."
	@mkdosfs /dev/loop2
	@echo "Copy kernel and grub files on partition..."
	@mkdir -p tempdir
	@mount /dev/loop2 tempdir
	@mkdir tempdir/boot
	@cp -r grub tempdir/boot/
	@cp kernel.bin tempdir/
	@sleep 1
	@umount /dev/loop2
	@rm -r tempdir
	@losetup -d /dev/loop2
	@echo "Installing GRUB..."
	@echo "device (hd0) hdd.img \n \
	       root (hd0,0)         \n \
	       setup (hd0)          \n \
	       quit\n" | grub --batch 1>/dev/null
	@echo "Done!"
all: kernel.bin
rebuild: clean all
.s.o:
	as -o $@ $<
.c.o:
	$(CC) -Iinclude $(CFLAGS) -o $@ -c $<
kernel.bin: $(OBJFILES)
	$(LD) -T linker.ld -o $@ $^
clean:
	rm -f $(OBJFILES) hdd.img kernel.bin

Two main targets are declared in the file: all, which builds the kernel, and image, which creates a boot disk. As in an ordinary makefile, the all target relies on the suffix rules .s.o and .c.o, which compile *.s and *.c files into object files (*.o), and on the target that generates kernel.bin by calling the linker with the script created earlier. These targets run the same commands as those given in step 3.
Of greatest interest here is the creation of the boot image hdd.img (the image target). Let us go through it step by step.

dd if=/dev/zero of=./hdd.img bs=512 count=16065 1>/dev/null 2>&1

This command creates the image that all further work is done on. The number of sectors was not chosen by chance: 16065 = 255 * 63. By default, fdisk treats the disk as if it had a CHS geometry with Heads (H) = 255 and Sectors (S) = 63, while the number of Cylinders (C) depends on the disk size. Thus, the minimum disk size that fdisk can work with without changing the default geometry is 512 * 255 * 63 * 1 = 8225280 bytes, where 512 is the sector size and 1 is the number of cylinders.
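The arithmetic behind the sector count can be restated as a short sketch. The function names chs_total_sectors and chs_total_bytes are hypothetical; the geometry values are the ones from the text.

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Number of sectors on a disk with the given CHS geometry. */
uint64_t chs_total_sectors(uint32_t cylinders, uint32_t heads, uint32_t sectors)
{
    return (uint64_t)cylinders * heads * sectors;
}

/* Disk size in bytes for the given CHS geometry. */
uint64_t chs_total_bytes(uint32_t cylinders, uint32_t heads, uint32_t sectors)
{
    return chs_total_sectors(cylinders, heads, sectors) * SECTOR_SIZE;
}
```

For one cylinder with 255 heads and 63 sectors per track this gives 16065 sectors, or 8225280 bytes, which matches the count passed to dd.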
Next, a partition table is created:

losetup /dev/loop1 ./hdd.img
(echo c; echo u; echo n; echo p; echo 1; echo ;  echo ; echo a; echo 1; echo t; echo c; echo w;) | fdisk /dev/loop1 1>/dev/null 2>&1 || true

The first command attaches the hdd.img file to the block device /dev/loop1, which lets you work with the file as if it were a device. The second command creates a partition table on /dev/loop1 with one primary bootable partition that occupies the entire disk and is marked with the FAT32 partition type.
Then we format the created partition. To do this, attach it as a block device and format it.
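What fdisk writes can be inspected by decoding the first sector of the image. The sketch below assumes the classic MBR layout (four 16-byte entries starting at byte 446, followed by the 0x55 0xAA signature); struct mbr_partition and mbr_read_partition are names invented for this example.

```c
#include <stdint.h>

#define MBR_TABLE_OFFSET 446u    /* offset of the first partition entry */

struct mbr_partition {
    int      bootable;      /* 1 if the status byte is 0x80 */
    uint8_t  type;          /* partition type, e.g. 0x0C = FAT32 (LBA) */
    uint32_t first_lba;     /* first sector of the partition */
    uint32_t sector_count;  /* partition length in sectors */
};

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes primary partition `index` (0..3); returns 0 on success, -1 on error. */
int mbr_read_partition(const uint8_t sector[512], int index,
                       struct mbr_partition *out)
{
    if (index < 0 || index > 3)
        return -1;
    if (sector[510] != 0x55 || sector[511] != 0xAA)   /* boot signature */
        return -1;

    const uint8_t *e = sector + MBR_TABLE_OFFSET + 16 * index;
    out->bootable     = (e[0] == 0x80);
    out->type         = e[4];
    out->first_lba    = le32(e + 8);
    out->sector_count = le32(e + 12);
    return 0;
}
```

After the fdisk script has run, the first entry should report a bootable partition of type 0x0C; the exact start sector depends on the fdisk version and its defaults.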

losetup /dev/loop2 ./hdd.img \
    --offset    `echo \`fdisk -lu /dev/loop1 | sed -n 10p | awk '{print $$3}'\`*512 | bc` \
    --sizelimit `echo \`fdisk -lu /dev/loop1 | sed -n 10p | awk '{print $$4}'\`*512 | bc`
losetup -d /dev/loop1

The first command attaches the previously created partition to the device /dev/loop2. The --offset option gives the byte offset of the beginning of the partition, and --sizelimit limits how many bytes of the file are mapped (the script derives it from the partition's end sector). Both values are obtained from the output of the fdisk command.
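The backquoted shell expressions simply multiply a sector number reported by fdisk by the sector size. As a sketch (sector_to_byte_offset is a hypothetical name, and the start sector used in the note below is only an example value, not one taken from the article):

```c
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Converts a sector number, as printed by fdisk -lu, to a byte offset. */
uint64_t sector_to_byte_offset(uint64_t sector)
{
    return sector * SECTOR_SIZE;
}
```

If fdisk reported a start sector of 63, the value passed to --offset would be 32256.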

mkdosfs /dev/loop2

The mkdosfs utility formats the partition with a FAT file system (when no -F option is given, it picks the FAT variant that suits the partition size).
The kernel itself is built with the commands discussed earlier, written in classic makefile syntax.
Now consider how the Grub files and the kernel are copied to the partition:

mkdir -p tempdir            # create a temporary directory
mount /dev/loop2 tempdir    # mount the partition on the directory
mkdir tempdir/boot          # create the /boot directory on the partition
cp -r grub tempdir/boot/    # copy the grub folder into /boot
cp kernel.bin tempdir/      # copy the kernel to the root of the partition
sleep 1                     # give Ubuntu time to finish
umount /dev/loop2           # unmount the partition
rm -r tempdir               # remove the temporary directory
losetup -d /dev/loop2       # detach the loop device

After executing the above commands, the image will be ready to install GRUB. The following command installs GRUB in the MBR of the hdd.img disk image.

echo "device (hd0) hdd.img \n \
      root (hd0,0)         \n \
      setup (hd0)          \n \
      quit\n" | grub --batch 1>/dev/null

Everything is ready for testing!

Step 6. Launch:

To compile, use the command:

make all

After which the kernel.bin file should appear.
To create a bootable disk image, use the command:

sudo make image

As a result, the hdd.img file should appear.
Now you can boot from the hdd.img disk image. You can verify this with the following command:

qemu -hda hdd.img -m 32

or:

qemu-system-i386 -hda hdd.img

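To check on a real machine, write this image to a flash drive with dd and boot from it. For example, with the following command (here /dev/sdb is assumed to be the flash drive; make sure you use the right device, since everything on it will be overwritten):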

sudo dd if=./hdd.img of=/dev/sdb

Summing up, we can say that as a result of the actions taken, we get a set of sources and scripts that allow us to conduct various experiments in the field of system programming. The first step has been taken towards creating system software such as hypervisors and operating systems.

Links to the following articles in the series:
"How to run a program without an operating system: part 2"
"How to run a program without an operating system: part 3. Graphics"
"How to run a program without an operating system: part 4. Parallel computing"
"How to run a program without an operating system: part 5. Accessing the BIOS from the OS"
"How to run a program without an operating system: part 6. Support for working with disks with the FAT file system"
