LLVM in terms of Go

Original author: Ayke van Laëthem
  • Translation
Compiler development is a very difficult task. But, fortunately, with the development of projects like LLVM, the solution to this problem is greatly simplified, which allows even a single programmer to create a new language close in performance to C. Working with LLVM is complicated by the fact that the system consists of a huge amount of code and is only sparsely documented. To try to correct this flaw, the author of the material we are publishing today demonstrates examples of code written in Go and shows how they are translated first to Go SSA and then to LLVM IR using the TinyGo compiler. The Go SSA and LLVM IR code has been lightly edited: material not relevant to the explanations given here has been removed in order to make these explanations easier to follow.


First example

The first function I'm going to examine here is a simple one that adds two numbers:

func myAdd(a, b int) int {
    return a + b
}
This function is very simple; it is hard to imagine anything simpler. It translates to the following Go SSA code:

func myAdd(a int, b int) int:
    t0 = a + b                                                    int
    return t0

In this representation of the function, hints about data types are placed on the right; in most cases you can ignore them.

This small example already shows one aspect of the essence of SSA. Namely, when code is converted to SSA form, each expression is broken down into the most elementary parts of which it consists. In our case, the statement return a + b is actually two operations: adding two numbers and returning the result.
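The decomposition can be imitated in Go itself by giving each elementary operation its own temporary, assigned exactly once, mirroring the SSA form above (a sketch; the temporary name t0 just echoes the SSA output):

```go
package main

import "fmt"

// myAdd written the way SSA sees it: one elementary operation per
// statement, each result assigned to a fresh temporary exactly once.
func myAdd(a, b int) int {
	t0 := a + b // the addition
	return t0   // the return of the result
}

func main() {
	fmt.Println(myAdd(2, 3)) // 5
}
```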

In addition, here you can see the basic blocks of the program; in this code there is only one block, the entry block. We will talk more about blocks below.

Go SSA code can be easily converted to LLVM IR:

define i64 @myAdd(i64 %a, i64 %b) {
  %0 = add i64 %a, %b
  ret i64 %0
}

You may notice that although different syntactic constructions are used here, the structure of the function has basically not changed. LLVM IR code is somewhat stricter than Go SSA code and closer to C. Here, in the function declaration, the return type comes first, and the type of each argument is indicated before the argument's name. In addition, to simplify IR parsing, the names of global entities are preceded by the symbol @ and the names of local entities by the symbol % (a function is also considered a global entity).

One of the features of this code that you need to pay attention to is that the decision about the representation of Go's int type, which can be a 32-bit or 64-bit value depending on the compiler and the compilation target, is made when the LLVM IR code is created. This is one of the many reasons why LLVM IR code is not platform independent. Code generated for one platform cannot simply be taken and compiled for another (unless you approach the task with extreme caution).
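If you want to observe from Go itself which width int ended up with on your platform, a small sketch like this works (the helper name intBits is my own):

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// intBits reports the width of Go's int on the current target:
// 32 on 32-bit platforms, 64 on 64-bit ones. The compiler makes
// this choice, and it is baked into the generated LLVM IR.
func intBits() int {
	return int(unsafe.Sizeof(int(0))) * 8
}

func main() {
	fmt.Println(intBits())       // 64 on a 64-bit platform
	fmt.Println(strconv.IntSize) // the same value, as a stdlib constant
}
```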

Another interesting point worth noting is that the type i64 is not a signed integer: it is neutral with respect to the sign of the number. Depending on the instruction, it can represent both signed and unsigned numbers. For the addition operation this makes no difference, so there is no distinction between working with signed and unsigned numbers here. It is also worth noting that in C, overflow of a signed integer variable leads to undefined behavior, so the Clang frontend adds an nsw (no signed wrap) flag to the operation, which tells LLVM that it may assume the addition never overflows.
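A quick way to see the difference: in Go the wrap-around below is guaranteed behavior, so a Go compiler emits a plain add without the nsw flag, while the equivalent C code would be undefined behavior (the helper name wrapAdd is my own):

```go
package main

import (
	"fmt"
	"math"
)

// wrapAdd shows that Go, unlike C, defines signed overflow: the
// result wraps around rather than being undefined behavior.
func wrapAdd(x int32) int32 {
	return x + 1
}

func main() {
	fmt.Println(wrapAdd(math.MaxInt32)) // -2147483648
}
```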

This may be important for some optimizations. For example, adding two i16 values on a 32-bit platform (with 32-bit registers) requires a sign-extension operation after the addition in order to stay in the i16 range. Because of this, it is often more efficient to perform integer operations at the machine's register size.
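This is easy to observe from Go: with int16 operands the result must wrap within 16 bits, which is exactly the fixup the compiler has to arrange on wider registers (the helper name addInt16 is my own):

```go
package main

import "fmt"

// addInt16 must produce a result that stays within 16 bits, so on a
// platform with wider registers the compiler has to truncate and
// sign-extend after the addition; with register-sized ints no such
// fixup is needed.
func addInt16(a, b int16) int16 {
	return a + b
}

func main() {
	fmt.Println(addInt16(32767, 1)) // -32768: the value wrapped in 16 bits
}
```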

What happens to this IR code afterwards does not particularly interest us now. The code is optimized (although in such a simple example as ours there is nothing left to optimize) and then converted to machine code.

Second example

The next example we will examine is a bit more complicated. Namely, it is a function that sums a slice of integers:

func sum(numbers []int) int {
    n := 0
    for i := 0; i < len(numbers); i++ {
        n += numbers[i]
    }
    return n
}

This code is converted to the following Go SSA code:

func sum(numbers []int) int:
entry:
    jump for.loop
for.loop:
    t0 = phi [entry: 0:int, for.body: t6] #n                       int
    t1 = phi [entry: 0:int, for.body: t7] #i                       int
    t2 = len(numbers)                                              int
    t3 = t1 < t2                                                  bool
    if t3 goto for.body else for.done
for.body:
    t4 = &numbers[t1]                                             *int
    t5 = *t4                                                       int
    t6 = t0 + t5                                                   int
    t7 = t1 + 1:int                                                int
    jump for.loop
for.done:
    return t0

Here you can already see more constructions specific to the SSA representation of code. Probably the most obvious feature of this code is the absence of structured flow-control statements. To control the flow of computation there are only conditional and unconditional jumps and, if we count it as a flow-control instruction, a return instruction.

In fact, you can see here that the program is not divided into blocks using curly braces (as in the languages of the C family). Instead, it is divided by labels, which resembles assembly language, and presented in the form of basic blocks. In SSA, basic blocks are contiguous sequences of code that start with a label and end with an instruction that terminates the block, such as return or jump.
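Go even lets us mimic this block structure directly with labels and goto, which makes the correspondence concrete (an illustrative sketch, not how TinyGo works internally; the names are my own):

```go
package main

import "fmt"

// sumBlocks mirrors the basic-block structure of the SSA form: an
// entry, a loop header, a body, and an exit, connected only by
// conditional and unconditional jumps.
func sumBlocks(numbers []int) int {
	var n, i int
	goto forLoop // entry block
forLoop:
	if i < len(numbers) {
		goto forBody
	}
	goto forDone
forBody:
	n = n + numbers[i]
	i = i + 1
	goto forLoop
forDone:
	return n
}

func main() {
	fmt.Println(sumBlocks([]int{1, 2, 3, 4})) // 10
}
```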

Another interesting detail of this code is the phi instruction. The instruction is quite unusual, and it may take some time to figure it out. Remember that SSA is short for Static Single Assignment. This is an intermediate representation of code used by compilers in which each variable is assigned only once. This is great for expressing simple functions like our myAdd shown above, but not for more complex functions like the sum function discussed in this section. In particular, the variables i and n change during the execution of the loop.

SSA bypasses the restriction on a single assignment of variable values using the so-called phi instruction (its name is taken from the Greek alphabet). The fact is that in order to form an SSA representation for languages like C, you have to resort to some tricks. The result of this instruction is the current value of the variable (i or n), and a list of basic blocks is used as its parameters. For example, consider this instruction:

t0 = phi [entry: 0:int, for.body: t6] #n

Its meaning is as follows: if the previous basic block was the entry block, then t0 is the constant 0, and if the previous basic block was for.body, then you take the value t6 from that block. All this may look rather mysterious, but it is this mechanism that makes single assignment possible. From a human point of view it complicates understanding of the code, but the fact that each value is assigned exactly once greatly simplifies many optimizations.
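One intuitive way to think about phi is as a function parameter: the value it receives depends on which caller "jumped" to it. Here is a sketch (helper names are my own) where the for.loop block becomes a function and its parameters n and i play the role of t0 and t1:

```go
package main

import "fmt"

// phiLoop plays the role of the for.loop block: its parameters n and
// i receive their values from whichever predecessor reaches it --
// 0 and 0 on the edge from entry, or the updated values on the edge
// from for.body. That choice is exactly what a phi instruction makes.
func phiLoop(numbers []int, n, i int) int {
	if i < len(numbers) {
		return phiLoop(numbers, n+numbers[i], i+1) // the for.body edge
	}
	return n // for.done
}

func sumPhi(numbers []int) int {
	return phiLoop(numbers, 0, 0) // the entry edge
}

func main() {
	fmt.Println(sumPhi([]int{1, 2, 3, 4})) // 10
}
```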

Note that if you are writing your own compiler, you usually do not have to deal with such things. Even Clang does not generate all these phi instructions; it uses the alloca mechanism (which resembles working with ordinary local variables). Then, during an LLVM optimization pass called mem2reg, the alloca instructions are converted to SSA form. TinyGo, however, receives its input from Go SSA, which, conveniently, has already been converted to SSA form.
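The alloca approach can be sketched in Go by putting every local in a memory slot and going through a pointer for each read and write (an illustration of the idea only; the name sumAllocaStyle is my own):

```go
package main

import "fmt"

// sumAllocaStyle mimics Clang's approach before mem2reg: every local
// variable lives in a memory slot (alloca), zero-initialized here by
// new, and every use goes through an explicit load or store, modeled
// with pointer dereferences.
func sumAllocaStyle(numbers []int) int {
	n := new(int) // "alloca" for n
	i := new(int) // "alloca" for i
	for *i < len(numbers) {
		*n = *n + numbers[*i] // load, add, store
		*i = *i + 1
	}
	return *n // final load
}

func main() {
	fmt.Println(sumAllocaStyle([]int{1, 2, 3})) // 6
}
```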

Another innovation in this piece of intermediate code is that access to slice elements by index is represented as an address-calculation operation followed by a dereference of the resulting pointer. Here you can also see constants added directly to the IR code (for example, 1:int); in the myAdd example this was not used. Now that we have figured out these features, let's take a look at what this code turns into when converted to LLVM IR form:

define i64 @sum(i64* %ptr, i64 %len, i64 %cap) {
entry:
  br label %for.loop
for.loop:                                         ; preds = %for.body, %entry
  %0 = phi i64 [ 0, %entry ], [ %5, %for.body ]
  %1 = phi i64 [ 0, %entry ], [ %6, %for.body ]
  %2 = icmp slt i64 %1, %len
  br i1 %2, label %for.body, label %for.done
for.body:                                         ; preds = %for.loop
  %3 = getelementptr i64, i64* %ptr, i64 %1
  %4 = load i64, i64* %3
  %5 = add i64 %0, %4
  %6 = add i64 %1, 1
  br label %for.loop
for.done:                                         ; preds = %for.loop
  ret i64 %0
}

Here, as before, we see the same structure expressed with different syntactic constructions. For example, in phi calls the values and labels are swapped. However, there is also something worth paying special attention to.

For starters, you can see a completely different function signature here. LLVM does not support slices, so as an optimization the TinyGo compiler that generated this intermediate code split the description of this data structure into parts. It could have represented the three elements of the slice (ptr, len and cap) as a structure (struct), but representing them as three separate entities allows for some optimizations. Other compilers may represent the slice in some other way; it depends on the calling conventions of the target platform.
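You can see those three elements from Go itself: a slice value really is a three-word header (a sketch; the helper name sliceWords is my own and assumes the usual slice-header layout of the mainstream Go toolchains):

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceWords reports how many machine words a slice header occupies:
// a data pointer, a length, and a capacity -- the same three values
// (ptr, len, cap) that TinyGo passes as three separate arguments.
func sliceWords() uintptr {
	var s []int
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(sliceWords()) // 3
	s := make([]int, 3, 8)
	fmt.Println(len(s), cap(s)) // 3 8
}
```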

Another interesting feature of this code is the use of the getelementptr instruction (often abbreviated GEP).

This instruction works with pointers and is used to get a pointer to a slice element. For example, let's compare it with the following code written in C:

int* sliceptr(int *ptr, int index) {
    return &ptr[index];
}

Or with the following equivalent:

int* sliceptr(int *ptr, int index) {
    return ptr + index;
}

The most important thing here is that the getelementptr instruction does not perform any dereferencing. It only calculates a new pointer based on the existing one. You can think of it as mul and add instructions at the hardware level. Details on the GEP instruction can be found in the LLVM documentation.
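The same mul-plus-add arithmetic can be spelled out in Go (a sketch; the helper name elemPtr is my own, and unsafe.Add requires Go 1.17+):

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemPtr computes what getelementptr computes: a new pointer derived
// from a base pointer and an index scaled by the element size. Note
// that nothing is dereferenced while calculating the address itself.
func elemPtr(base *int64, index int) *int64 {
	offset := index * int(unsafe.Sizeof(int64(0)))            // the "mul"
	return (*int64)(unsafe.Add(unsafe.Pointer(base), offset)) // the "add"
}

func main() {
	xs := []int64{10, 20, 30}
	fmt.Println(*elemPtr(&xs[0], 2)) // 30
}
```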

Another interesting feature of this intermediate code is the icmp instruction. This is a general-purpose instruction used to implement integer comparisons. The result of this instruction is always a value of type i1, a logical value. In this case the comparison is performed using the keyword slt (signed less than), since we are comparing two numbers previously represented by the type int. If we were comparing two unsigned integers, we would still use the icmp instruction, but the keyword would be ult. For comparing floating-point numbers another instruction, fcmp, is used, which works in a similar way.
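The signed/unsigned distinction is easy to demonstrate from Go: the same bit patterns compare differently depending on the type, and a compiler would typically lower the two functions below to icmp slt and icmp ult respectively (helper names are my own):

```go
package main

import "fmt"

// lessSigned and lessUnsigned compare the same bit patterns, but the
// interpretation of the sign bit differs, so they need different
// comparison predicates.
func lessSigned(a, b int64) bool    { return a < b }
func lessUnsigned(a, b uint64) bool { return a < b }

func main() {
	a, b := int64(-1), int64(1)
	fmt.Println(lessSigned(a, b))                   // true: -1 < 1
	fmt.Println(lessUnsigned(uint64(a), uint64(b))) // false: -1 reinterpreted is huge
}
```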


I believe that in this article I have examined the most important features of LLVM IR. Of course, there is still a lot more. In particular, the intermediate representation may contain many annotations that allow optimization passes to take into account certain facts known to the compiler that cannot be expressed in IR in any other way. For example, there is the inbounds flag of the GEP instruction, or the nsw and nuw flags, which can be added to the add instruction. The same goes for the private keyword, which tells the optimizer that the function it marks will not be referenced from outside the current compilation unit. This enables many interesting interprocedural optimizations, such as the elimination of unused arguments.

More details about LLVM can be found in the documentation, which you will refer to often when developing your own LLVM-based compiler. There is also a guide that walks through compiler development for a very simple language. Both of these sources of information will come in handy when creating your own compiler.
