Studying the L4 microkernel: writing a “Hello world” application for the Xameleon system

    If you have ever studied the C language or explored a new development environment, you have probably written, at least once, the simplest application that prints “Hello world”. One possible version in C:

    #include <stdio.h>

    int main(int argc, char * argv[], char * envp[])
    {
        puts("Hello world!");
        return 0;
    }

    Save this code to the file “hello.c” and build the executable with the gcc compiler using the following command:

    gcc hello.c -o hello

    As a result, provided that the compiler, header files and libraries are installed on your system, we get the executable file hello. Let's run it:

    ./hello

    Elementary? Only until you decide to build and run this application under, say, an operating system you wrote yourself. Below I describe this process in detail, and I bet that not everyone will find the strength to read the article to the end.

    First, a little theory and some simple things. Let's try to build an object file instead of an executable. To do this, we need the following command:

    gcc -c hello.c

    As a result, we get the object file hello.o. What is typical of object files? Their code is relocatable (not yet bound to final addresses), they contain tables of imported and exported functions and variables and, most interesting for us, their code depends only slightly on the software platform but is tied to the processor architecture. I will explain later why this matters; for now, let's take a closer look at the contents of the object file using the following command:

    objdump -hxS hello.o

    Object file header:

    hello.o: file format elf32-i386
    hello.o
    architecture: i386, flags 0x00000011:
    HAS_RELOC, HAS_SYMS
    start address 0x00000000

    Used sections and their sizes:

    Sections:
    Idx Name             Size      VMA       LMA       File off  Algn
      0 .text            0000002e  00000000  00000000  00000034  2**2
                         CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data            00000000  00000000  00000000  00000064  2**2
                         CONTENTS, ALLOC, LOAD, DATA
      2 .bss             00000000  00000000  00000000  00000064  2**2
                         ALLOC
      3 .rodata          0000000d  00000000  00000000  00000064  2**0
                         CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 .comment         00000012  00000000  00000000  00000071  2**0
                         CONTENTS, READONLY
      5 .note.GNU-stack  00000000  00000000  00000000  00000083  2**0
                         CONTENTS, READONLY

    The .text section contains the machine code of the main function of hello.c. As the listing shows, the program code is 46 bytes (0x2e) long. The .data section is empty because our example uses no static or global variables that would be stored there. Finally, the 13-byte (0xd) .rodata section contains the string “Hello world!” together with its terminating zero byte.
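
    The two sizes reported by objdump can be checked by hand. The following is only a small sketch (the helper name rodata_string_size is made up for illustration): the 13 bytes in .rodata are the 12 characters of the string plus the terminating NUL, and 0x2e is 46 in decimal.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time checks of the two numbers from the section table. */
static_assert(sizeof("Hello world!") == 13, "12 characters plus the NUL terminator");
static_assert(0x2e == 46, ".text size converted to decimal");

/* Hypothetical helper: size of the literal as it is stored in .rodata. */
size_t rodata_string_size(void)
{
    return sizeof("Hello world!");
}
```

    Note that sizeof counts the terminating NUL, whereas strlen would report 12.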

    Table of exported and imported objects:

    SYMBOL TABLE:
    00000000 l    df *ABS*            00000000 hello.c
    00000000 l    d  .text            00000000 .text
    00000000 l    d  .data            00000000 .data
    00000000 l    d  .bss             00000000 .bss
    00000000 l    d  .rodata          00000000 .rodata
    00000000 l    d  .note.GNU-stack  00000000 .note.GNU-stack
    00000000 l    d  .comment         00000000 .comment
    00000000 g     F .text            0000002e main
    00000000         *UND*            00000000 puts

    In this table, we are interested in the last two lines: the description of the main function, which is defined in our simple program, and the description of the external puts function, which is defined elsewhere. In principle, if you compile this example under Linux and link the resulting object file under FreeBSD, it will most likely work without problems. The reverse is also true. Now let's look at the assembly code of our program hello.c:

    Disassembly of section .text:

    00000000 <main>:
      0:    8d 4c 24 04             lea    0x4(%esp),%ecx
      4:    83 e4 f0                 and    $0xfffffff0,%esp
      7:    ff 71 fc                 pushl -0x4(%ecx)
      a:    55                      push  %ebp
      b:    89 e5                    mov    %esp,%ebp
      d:    51                      push  %ecx
      e:    83 ec 04                 sub    $0x4,%esp
     11:    83 ec 0c                 sub    $0xc,%esp
     14:    68 00 00 00 00          push  $0x0
                15: R_386_32    .rodata
     19:    e8 fc ff ff ff          call  1a <main+0x1a>
                1a: R_386_PC32    puts
     1e:    83 c4 10                 add    $0x10,%esp
     21:    b8 00 00 00 00          mov    $0x0,%eax
     26:    8b 4d fc                 mov    -0x4(%ebp),%ecx
     29:    c9                      leave 
     2a:    8d 61 fc                 lea    -0x4(%ecx),%esp
     2d:    c3                      ret    

    This is the assembly code that the compiler generated for our application. By the way, welcome to AT&T syntax :)

    That was the simple part; now let's move on to more complex things. The very first example in the article creates an executable file for your own system, in my case Slackware Linux. We know that the contents of the object file depend on the processor architecture but only slightly on the operating system. How do we create an executable file for the Xameleon system from the resulting object file? To do this, we need to link the object file with the Xameleon library functions.

    Our example uses the puts function, declared in the stdio.h header file. What is the puts function? It writes a string followed by a newline to the standard output stream. For example, puts can be written like this:

    int puts(const char * str)
    {
        int status, len;

        len = strlen(str);                                // Compute the length of the string
        status = write( 1, str, len );                    // Write the string to the standard output stream
        if( status == len ) status = write( 1, "\n", 1 ); // If there was no error, append a newline
        if( status == 1 ) status += len;                  // On success, report the total number of bytes written
        return status;
    }

    You can delve “endlessly” into the jungle of libc, but the purpose of this article is to show how it works in the Xameleon system. By the way, the puts example above does not claim to be optimal; it only demonstrates a simple function. You will have to take my word for it that the Xameleon libc is written more optimally. But we digress, so let's get back to business and take a close look at the write function. This function is defined in the POSIX standard, and it is the one that implements the interaction of our example with the operating system. Refresh your memory with the command: man 2 write

    #include <unistd.h>

    ssize_t write(int fd, const void *buf, size_t count);

    Our example writes to file descriptor 1, which by convention is the standard output stream. The buf parameter is a pointer to the memory region containing the data to output (yes, this will be the address of the string “Hello world!”). The third parameter is the size of the data to output, in bytes.
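
    To see the three parameters at work outside of standard output, here is a minimal POSIX sketch. The function name pipe_roundtrip is invented for this illustration: it writes the first count bytes of a buffer into a pipe and reads them back, which shows that the descriptor selects the destination and that count, not the string terminator, decides how much is transferred.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: sends `count` bytes of `buf` through a pipe with
   write() and reads them back into `out`. Returns the number of bytes read,
   or -1 on error. `count` should stay well below the pipe capacity,
   otherwise write() would block. */
ssize_t pipe_roundtrip(const void *buf, size_t count, void *out, size_t out_size)
{
    int fds[2];

    assert(buf != NULL && out != NULL);
    if (pipe(fds) != 0)
        return -1;

    ssize_t sent = write(fds[1], buf, count);   /* fd, data pointer, byte count */
    close(fds[1]);                              /* lets read() see end of data */

    ssize_t got = (sent < 0) ? -1 : read(fds[0], out, out_size);
    close(fds[0]);
    return got;
}
```

    Calling pipe_roundtrip("Hello world!", 5, out, sizeof out) transfers only the five bytes of "Hello".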

    What distinguishes application programmers from system programmers?
    The application programmer will say: "The write function writes an array of data to an open file." The system programmer will respond: "The write function passes the system a file descriptor, a pointer to the data and the number of bytes to write." Both are right, but let me remind you what this blog is called. :)

    So, we understand that the library function write calls into the kernel of the operating system, and we need to understand how this happens. Since the Xameleon system is implemented on top of the L4 Pistachio microkernel, the system call is an IPC with two phases:
    1. Send phase - passes the file descriptor and an L4 string item (a data type that describes a memory region) to the file system service, which handles the write function.
    2. Receive phase - receives the status of the operation from the file system service.
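
    To make the two phases concrete, here is a toy model in plain C. It is only a sketch: the struct fs_request layout, the FS_WRITE_FILE label value and the functions fs_service_handle and ipc_call are invented for illustration and are not the L4 or Xameleon API. A direct function call stands in for the message transfer performed by the microkernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request: NOT the real L4 message layout. */
struct fs_request {
    int         label;  /* which operation is requested */
    int         fd;     /* file descriptor */
    const void *data;   /* start of the memory region (the "string item") */
    size_t      size;   /* length of the region in bytes */
};

enum { FS_WRITE_FILE = 1 };  /* made-up label value */

static char   sink[64];      /* stands in for the file behind descriptor 1 */
static size_t sink_used;

/* Stand-in for the file system service: consumes a request, returns a status. */
int fs_service_handle(const struct fs_request *req)
{
    assert(req != NULL);
    if (req->label != FS_WRITE_FILE || req->fd != 1)
        return -1;
    if (req->size > sizeof sink - sink_used)
        return -1;
    memcpy(sink + sink_used, req->data, req->size);
    sink_used += req->size;
    return (int)req->size;
}

/* Stand-in for the IPC call: a send phase followed by a receive phase. */
int ipc_call(const struct fs_request *req)
{
    int reply = fs_service_handle(req);  /* send phase: the request reaches the service */
    return reply;                        /* receive phase: the status comes back */
}
```

    A request with label FS_WRITE_FILE, descriptor 1 and a 12-byte region yields the status 12; any other descriptor is rejected with -1 in this model.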


    In the Xameleon documentation, the write system call is described as follows.

    In this way, the POSIX write system call translates into an IPC, as shown in the figure above.

    Here is the source code of the library function write, which on the one hand provides the POSIX write() interface and on the other hand interacts with the kernel of the operating system:

    ssize_t write(
      int            nFileDescriptor,
      const void      *  pBuffer,
      size_t          nBytesWrite )
    {
      int            nStatus;
      int            nChunk;
      int            nTotalSent;
      char        *  pPointer;

      L4_MsgTag_t       tag;
      L4_Msg_t        msg;
      L4_StringItem_t     SendString;

      nStatus = nTotalSent = 0;
      pPointer = (char*) pBuffer;

      while ( nBytesWrite )
      {
        nChunk = (nBytesWrite > 4096) ? 4096 : nBytesWrite;
        SendString = L4_StringItem ( nChunk, (void*) pPointer );

        L4_Clear( &msg );
        L4_Set_Label( &msg, fsWriteFile );
        L4_Append( &msg, nFileDescriptor );
        L4_Append( &msg, &SendString );
        L4_Load( &msg );
        tag = L4_Call( fs_service_id );
        if ( L4_IpcFailed(tag) )
        {
          nStatus = nTotalSent ? nTotalSent : XAM_EINTR;
          break;
        }
        else
        {
          L4_Store( tag, &msg );
          nStatus = L4_Get( &msg, 0 );
          if( nStatus < 0 ) break;
          nTotalSent += nStatus;
        }

        pPointer  += nChunk;
        nBytesWrite -= nChunk;
      }

      // POSIX workaround
      if( nStatus < 0 )
      {
        errno = nStatus;
        nStatus = -1;
      }
      else
      {
        nStatus = nTotalSent;
      }

      return nStatus;
    }



    Looks like something new from the greedy Xameleon developers. Let's take a closer look at the code and see how it corresponds to the WriteFile call from the documentation. The first thing that catches the eye is that each transfer is limited to 4 kilobytes. This restriction stems from the design of the file system module: longer data is better passed through another system call, which temporarily maps pages of the requesting process into the address space of the file system service. That feature goes far beyond a simple Hello world, so we will not consider it here.
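
    The chunking logic can be isolated from the IPC details. The following is a simplified sketch, not the Xameleon code: the callback type send_chunk_fn and the functions send_in_chunks and count_calls are hypothetical names, and the callback stands in for one IPC round trip to the file system service. The error handling is also simplified: the sketch returns the bytes already sent if there are any, otherwise the negative status.

```c
#include <assert.h>
#include <stddef.h>

enum { CHUNK_MAX = 4096 };  /* per-call limit mentioned in the article */

/* Hypothetical callback standing in for one IPC round trip:
   returns the number of bytes accepted, or a negative value on error. */
typedef int (*send_chunk_fn)(const char *data, size_t size, void *ctx);

/* Sends `total` bytes in pieces of at most CHUNK_MAX bytes. */
long send_in_chunks(const char *data, size_t total, send_chunk_fn send, void *ctx)
{
    long sent = 0;

    assert(send != NULL);
    while (total > 0) {
        size_t chunk = total > CHUNK_MAX ? CHUNK_MAX : total;
        int status = send(data, chunk, ctx);
        if (status < 0)
            return sent > 0 ? sent : status;  /* report partial progress if any */
        sent  += status;
        data  += chunk;
        total -= chunk;
    }
    return sent;
}

/* Example callback: accepts everything and counts how often it is called. */
int count_calls(const char *data, size_t size, void *ctx)
{
    (void)data;
    ++*(int *)ctx;
    return (int)size;
}
```

    With a 10000-byte buffer the callback is invoked three times: twice with 4096 bytes and once with the remaining 1808.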
    The following data types are L4 Pistachio microkernel structures used for messaging:
    • L4_MsgTag_t - the message tag.
    • L4_Msg_t - a structure that carries the message data.
    • L4_StringItem_t - an L4 string item.

    These structures are defined in the message.h header file of the L4 Pistachio microkernel.

    The following code prepares a message for transmission to the file system service:

    SendString = L4_StringItem ( nChunk, (void*) pPointer ); // Prepare the data descriptor
    L4_Clear( &msg );                                        // Clear the message
    L4_Set_Label( &msg, fsWriteFile );                       // Set the system call identifier
    L4_Append( &msg, nFileDescriptor );                      // Append the descriptor of the file being written
    L4_Append( &msg, &SendString );                          // Append the data to be written
    L4_Load( &msg );                                         // Prepare the message for transmission

    Next comes the call to the microkernel, which actually performs the Inter Process Communication:

    tag = L4_Call( fs_service_id );

    Here fs_service_id is a variable of type L4_ThreadId_t that contains the file system service identifier. How to obtain it I will explain below, when we get to the magic of the CRT (C RunTime code). Now let's look at the code that analyzes the response from the file system service:

    if ( L4_IpcFailed(tag) )
    {
        nStatus = nTotalSent ? nTotalSent : XAM_EINTR;
        break;
    }
    else
    {
        L4_Store( tag, &msg );
        nStatus = L4_Get( &msg, 0 );
        if( nStatus < 0 ) break;
        nTotalSent += nStatus;
    }

    If the IPC fails, we check whether data was transferred in a previous iteration and generate the corresponding return code. If the IPC completes correctly, we process the return code.



    Descriptions of the functions with the L4_ prefix used here can be found in the header files of the L4 Pistachio microkernel.

    Difficult? I think so. But we are already close to the magic and the byte-level hackery. The time has come to look carefully at the fs_service_id variable. The Xameleon system is designed so that the application initially does not know the identifier of the file system service, so it has to obtain it somehow.

    The allocation of all resources, including processes, threads of execution and memory, is the responsibility of the Supervisor process. One of its system calls returns a service identifier given the service name. The initialization code for the libc functions that interact with the file system is as follows:

    static const char  szServiceName[3] = "fs"; // Internal name of the file system service

    L4_ThreadId_t  fs_service_id = L4_nilthread;

    extern "C" int xam_filesystem_init(void)
    {
      fs_service_id = GetDeviceHandle(szServiceName);
      return L4_IsNilThread(fs_service_id) ? XAM_ENODEV : 0;
    }




    Let's go even deeper into the code and see the implementation of the GetDeviceHandle function, which returns the identifier of the requested service.

    extern L4_ThreadId_t  rootserver_id; // The Supervisor's main handler

    L4_ThreadId_t GetDeviceHandle(const char * szDeviceName)
    {
      L4_MsgTag_t           tag;
      L4_Msg_t            msg;
      L4_ThreadId_t          Handle;

      Handle = L4_nilthread;
      
      do {
      
        L4_Clear(&msg);
        L4_Set_Label(&msg, cmdGetDeviceHandle );
        L4_Append(&msg, L4_StringItem( 1+strlen(szDeviceName), (void*) szDeviceName) );
        L4_Load(&msg);
        tag = L4_Call( rootserver_id );
        if( L4_IpcFailed(tag) ) break;
        L4_Store( tag, &msg );
        Handle.raw = L4_Get(&msg, 1);
        
      } while( false );

      return Handle;
    }



    You can draw an analogy with the previous example, but there is one difference: the variable rootserver_id, which contains the identifier of the Supervisor. Since the functions xam_filesystem_init and GetDeviceHandle are called from the CRT, the Supervisor identifier must be obtained before the library is initialized.
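
    The name-to-identifier lookup can be pictured with a toy model. This is a sketch under loud assumptions: the article does not show how the Supervisor stores its services, so the registry table, its single entry with the value 0x000d0001, the service_id type and the functions lookup_service and filesystem_init are all invented here. Only the overall shape (resolve a name, treat the nil identifier as "no such device") follows the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long service_id;     /* stand-in for the raw value of L4_ThreadId_t */
#define NIL_SERVICE ((service_id)0)   /* stand-in for L4_nilthread */

struct service_entry {
    const char *name;
    service_id  id;
};

/* Made-up registry contents; the real Supervisor table is not shown in the article. */
static const struct service_entry registry[] = {
    { "fs", 0x000d0001UL },
};

/* Supervisor side of the model: resolve a service name to an identifier. */
service_id lookup_service(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof registry / sizeof registry[0]; ++i)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].id;
    return NIL_SERVICE;
}

/* Client side of the model: mirrors the shape of xam_filesystem_init(). */
int filesystem_init(service_id *out)
{
    *out = lookup_service("fs");
    return *out == NIL_SERVICE ? -1 : 0;
}
```

    In this model an unknown name such as "net" resolves to NIL_SERVICE, which the client reports as an initialization error.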

    How can an application get the Supervisor identifier? We have come very close to the byte-level hackery, so let's look at a data structure called the Kernel Interface Page (KIP). This structure is described in the L4 Pistachio microkernel specification and looks as follows:


    Since the Supervisor is the first user process from the microkernel's point of view, its identifier can be derived from the ThreadInfo field of the KIP. A peculiarity of the KIP is that the microkernel keeps a single copy of this page but maps it into the address space of every process. To get the KIP address, the process must execute the following instruction sequence:

        lock;    nop
        mov    %eax, kip


    The lock; nop instruction sequence raises an exception, which the microkernel catches; before returning from the exception, it places the address of the Kernel Interface Page in the EAX register.

    Finally, the finishing touch: computing the identifier of the Supervisor's serving thread from the data in the Kernel Interface Page.

      mov  kip, %eax
      movw  198(%eax), %ax
      shrw  $4, %ax
      movzwl  %ax, %eax
      addl  $2, %eax
      sall  $14, %eax
      orl  $1, %eax
      movl  %eax, rootserver_id
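
    The same arithmetic can be written in C, which makes the bit manipulation easier to follow. This is a sketch based on my reading of the 32-bit L4 X.2 layout, so treat the following as assumptions: the ThreadInfo word sits at KIP offset 196, its top 12 bits hold UserBase (the first user thread number), and a global thread identifier consists of an 18-bit thread number above a 14-bit version. Under those assumptions, the 16-bit load from offset 198 picks up the upper half of ThreadInfo on little-endian x86. The function name rootserver_from_thread_info is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: computes the Supervisor (root server) thread identifier
   from the 32-bit ThreadInfo word of the KIP, step by step like the assembly. */
uint32_t rootserver_from_thread_info(uint32_t thread_info)
{
    uint32_t user_base = thread_info >> 20;  /* movw 198(%eax),%ax; shrw $4,%ax */
    uint32_t thread_no = user_base + 2;      /* addl $2: third user thread */
    return (thread_no << 14) | 1u;           /* sall $14; orl $1: version 1 */
}
```

    For a ThreadInfo word whose UserBase is 48, the thread number becomes 50 and the resulting identifier is 0x000C8001. The offset of 2 corresponds to the assembly's addl $2: the Supervisor is taken to be the third user-level thread.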


    Thus, at program startup the CRT0 module initializes the library functions that communicate with the various services of the Xameleon system.

    Dear readers, I am very surprised that you did not fall asleep or close the browser window, and that you found the strength to read this far. You will probably be interested to “touch” the latest version of the Xameleon System Developer Toolkit.

    Thank you for your attention.
