nerumb June 2, 2017 at 13:09

Kotlin, bytecode compilation and performance (part 1)

A lot has been said about Kotlin lately (especially after the latest news from Google I/O 17), yet there is surprisingly little of one very useful kind of information: what Kotlin actually compiles into.
Let's take a closer look at how it compiles to JVM bytecode.

This is the first part of the publication. The second one can be seen here.

Compilation is a rather broad topic, and to cover its nuances better I took most of the compilation examples from a talk by Dmitry Jemerov: Caught in the Act: Kotlin Bytecode Generation and Runtime Performance. All benchmarks are taken from the same talk. In addition to reading this article, I highly recommend watching his presentation, since some things are described in more detail there. Here I focus more on how the language is compiled.

Contents:

Functions at the file level
Primary constructors
data classes
Properties in the class body
Not-null types in public and private methods
Extension functions
Method bodies in interfaces
Default arguments
Lambdas

But before we look at the main language constructs and the bytecode they compile into, we need to mention how the language itself is compiled:

The kotlinc compiler receives source files as input: not only Kotlin files but also Java files. This is necessary so that you can freely reference Java from Kotlin and vice versa. The compiler understands Java sources perfectly well but does not compile them; at this stage only the Kotlin files are compiled. The resulting *.class files are then passed to the Java compiler together with the *.java source files. At this stage all Java files are compiled, after which everything can be packaged together into a jar (or assembled in any other way).

To see which bytecode Kotlin generates, in IntelliJ IDEA you can open a special window via Tools -> Kotlin -> Show Kotlin Bytecode. After that, whenever you open a *.kt file, its bytecode is shown in this window. If the bytecode contains nothing that cannot be expressed in Java, the Decompile button also lets you decompile it into Java code.

If you look at any *.class file produced by Kotlin, you will see a large @Metadata annotation there:

@Metadata(
   mv = {1, 1, 6},
   bv = {1, 0, 1},
   k = 1,
   d1 = {"\u0000\u0014\n\u0002\u0018\u0002\n\u0002\u0010\u0000\n\u0002\b\u0002\n\u0002\u0010\b\n\u0002\b\u0003\u0018\u00002\u00020\u0001B\u0005¢\u0006\u0002\u0010\u0002R\u0014\u0010\u0003\u001a\u00020\u0004X\u0086D¢\u0006\b\n\u0000\u001a\u0004\b\u0005\u0010\u0006¨\u0006\u0007"},
   d2 = {"LSimpleKotlinClass;", "", "()V", "test", "", "getTest", "()I", "production sources for module KotlinTest_main"}
)

It contains all the information that exists in the Kotlin language and cannot be represented at the Java bytecode level, for example information about properties, nullable types, and so on. You do not need to work with this information directly, but the compiler uses it, and it can be accessed through the Reflection API. The metadata format is actually Protobuf with its own declarations.
Dmitry Jemerov

Let's now move on to examples that cover the basic constructs and how they are represented in bytecode. To avoid wading through bulky bytecode listings, in most cases we will look at the version decompiled into Java:

File Level Functions

Let's start with the simplest example: a file-level function.

//Kotlin, file Example1.kt
fun foo() { }

There is no similar construction in Java. In bytecode, it is implemented by creating an additional class.

//Java
public final class Example1Kt {
   public static final void foo() {
   }
}

The class is named after the source file with the suffix Kt appended (in this case, Example1Kt). You can also change the class name with the @file:JvmName annotation:

//Kotlin
@file:JvmName("Utils")
fun foo() { }

//Java
public final class Utils {
   public static final void foo() {
   }
}

Primary constructors

Kotlin has the ability to declare properties directly in the constructor header.

//Kotlin
class A(val x: Int, val y: Long) {}

They become constructor parameters, fields are generated for them, and the values passed to the constructor are written to these fields. Getters are also created so that the fields can be read. The decompiled version of the example above looks like this:

//Java
public final class A {
   private final int x;
   private final long y;
   public final int getX() {
      return this.x;
   }
   public final long getY() {
      return this.y;
   }
   public A(int x, long y) {
      this.x = x;
      this.y = y;
   }
}

If we change val to var for x in the declaration of class A, a setter will be generated as well. It is also worth noting that class A is declared with the final and public modifiers. This is because all classes in Kotlin are final by default and have public visibility.

data classes

Kotlin has a special class modifier, data.

//Kotlin
data class B(val x: Int, val y: Long) { }

This keyword tells the compiler to generate the equals, hashCode, toString, copy, and componentN functions for the class. The latter are needed so that the class can be used in destructuring declarations. Let's look at the decompiled code:

//Java
public final class B {
   // --- fields, getters and constructor are the same as in the previous example (class A)
   public final int component1() {
      return this.x;
   }
   public final long component2() {
      return this.y;
   }
   @NotNull
   public final B copy(int x, long y) {
      return new B(x, y);
   }
    public String toString() {
      return "B(x=" + this.x + ", y=" + this.y + ")";
   }
   public int hashCode() {
      return this.x * 31 + (int)(this.y ^ this.y >>> 32);
   }
   public boolean equals(Object var1) {
      if(this != var1) {
         if(var1 instanceof B) {
            B var2 = (B)var1;
            if(this.x == var2.x && this.y == var2.y) {
               return true;
            }
         }
         return false;
      } else {
         return true;
      }
   }
}

In practice, the data modifier is very often used, especially for classes that participate in the interaction between components, or are stored in collections. Also, data classes allow you to quickly create an immutable container for data.

Properties in the class body

Properties can also be declared in the class body.

//Kotlin
class C {
    var x: String? = null
}

In this example we declared in class C a property x of type String?, that is, a String that may also be null. In this case, additional @Nullable annotations appear in the code:

//Java
import org.jetbrains.annotations.Nullable;
public final class C {
   @Nullable
   private String x;
   @Nullable
   public final String getX() {
      return this.x;
   }
   public final void setX(@Nullable String var1) {
      this.x = var1;
   }
}

In the decompiled version we see both a getter and a setter (since the property is declared with var). The @Nullable annotation is there so that static analyzers that understand it can check the code and report possible mistakes.

If we do not need getter and setter, but just need a public field, then we can add the @JvmField annotation:

//Kotlin
class C {
    @JvmField var x: String? = null
}

Then the resulting Java code will be as follows:

//Java
public final class C {
   @JvmField
   @Nullable
   public String x;
}

Not-null types in public and private methods

In Kotlin, there is a slight difference between what bytecode is generated for public and private methods. Let's look at an example of two methods into which not-null variables are passed.

//Kotlin
class E {
    fun x(s: String) {
        println(s)
    }
    private fun y(s: String) {
        println(s)
    }
}

In both methods, a parameter s of type String is passed, and in both cases this parameter cannot be null.

//Java
import kotlin.jvm.internal.Intrinsics;
public final class E {
   public final void x(@NotNull String s) {
      Intrinsics.checkParameterIsNotNull(s, "s");
      System.out.println(s);
   }
   private final void y(String s) {
      System.out.println(s);
   }
}

In this case an additional check (Intrinsics.checkParameterIsNotNull) is generated for the public method, which verifies that the passed parameter really is not null. This is done so that public methods can be called safely from Java: if null is suddenly passed in, the method should fail right there, without passing the value further along the code. This helps diagnose errors early. Private methods have no such check. A private method cannot be called from Java at all, except through reflection, and with reflection you can break a lot of things if you really want to. For calls from Kotlin, the compiler itself tracks the call sites and does not allow null to be passed to such a method.
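
To make the fail-fast behaviour concrete, here is a minimal Java sketch of what a method compiled from Kotlin effectively does on entry. The names NullCheckSketch and checkNotNullParam are hypothetical and written only for this illustration; this is not the code of kotlin.jvm.internal.Intrinsics, and the exception message is an assumption of the sketch.

```java
// Illustrative sketch only: NullCheckSketch and checkNotNullParam are
// hypothetical names, not part of the Kotlin runtime.
final class NullCheckSketch {

    // Stand-in for the generated parameter check: fail at the method boundary.
    static void checkNotNullParam(Object value, String paramName) {
        if (value == null) {
            throw new IllegalArgumentException(
                "Parameter specified as non-null is null: " + paramName);
        }
    }

    // Mirrors the public method x(s: String): the parameter is checked on entry.
    static String publicX(String s) {
        checkNotNullParam(s, "s");
        return s.trim();
    }

    // Mirrors the private method y(s: String): no check, so a null would only
    // surface later, as a NullPointerException at the first dereference.
    static String privateY(String s) {
        return s.trim();
    }

    public static void main(String[] args) {
        try {
            publicX(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at the boundary: " + e.getMessage());
        }
        try {
            privateY(null);
        } catch (NullPointerException e) {
            System.out.println("failed later, at the first dereference");
        }
    }
}
```

The point of the sketch is the difference in where the failure shows up: the checked variant names the offending parameter immediately, while the unchecked one fails wherever the value happens to be used first.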

Such checks, of course, cannot be completely free in terms of performance. It is quite interesting to measure how much they cost, but that is hard to do with simple benchmarks. So let's look at the data that Dmitry Jemerov managed to obtain:

(Chart: checking parameters for null)

For one parameter, the cost of such a not-null check is generally negligible. For a method with eight parameters that does nothing but check them for null, the cost becomes noticeable. In any case, in everyday code this cost (approximately 3 nanoseconds) can be ignored; most likely it is the very last thing in your code that will need optimizing. If you still need to remove the checks, this is currently possible with additional kotlinc compiler options: -Xno-param-assertions and -Xno-call-assertions (important: before disabling the checks, make sure they really are the cause of your problems and that disabling them will not do more harm than good).

Extension functions

Kotlin allows you to extend the API of existing classes, whether they are written in Kotlin or in Java. For any class you can declare a function and then use it on instances of that class as if it had been part of the class from the start.

//Kotlin (file Example6.kt)
class T(val i: Int)
fun T.foo(): Int {
	return i
}
fun useFoo() {
	T(1).foo()
}

On the Java side, a class is generated that simply contains a static method with the same name as the extension function. An instance of the class being extended is passed to this method as a parameter. So when we call an extension function, we are actually passing the object on which we call it to a static function.

//Java
public final class Example6Kt {
   public static final int foo(@NotNull T $receiver) {
      Intrinsics.checkParameterIsNotNull($receiver, "$receiver");
      return $receiver.getI();
   }
   public static final void useFoo() {
      foo(new T(1));
   }
}

Almost all of the Kotlin standard library consists of extension functions for JDK classes. Kotlin has a very small standard library and does not declare its own collection classes. All collections created through listOf, setOf, and mapOf, which at first glance look like Kotlin's own, are actually ordinary Java collections such as ArrayList, HashSet, or HashMap. So if you need to pass such a collection to a library (or receive one from a library), there is no overhead for copying it or converting it to some internal classes (unlike Scala <-> Java).

Method bodies in interfaces

Kotlin has the ability to add an implementation for methods in interfaces.

//Kotlin
interface I {
    fun foo(): Int {
        return 42
    }
}
class D : I {  }

Java 8 introduced this capability as well, but because Kotlin has to run on Java 6, the resulting code in Java looks like this:

//Java
public interface I {
   int foo();
   public static final class DefaultImpls {
      public static int foo(I $this) {
         return 42;
      }
   }
}
public final class D implements I {
   public int foo() {
      return I.DefaultImpls.foo(this);
   }
}

In Java, a regular interface is created with only the method declaration, and a nested DefaultImpls class appears that holds the default implementation of each such method as a static function. A class that implements the interface gets a method that calls the corresponding DefaultImpls function and passes the object itself (this) as the argument.

The Kotlin team plans to move to implementing this functionality with Java 8 default methods, but at the moment there are difficulties with preserving binary compatibility with already compiled libraries. A discussion of this problem can be found on YouTrack. This is not a big problem in itself, but if your project is going to expose an API to Java, this feature should be taken into account.
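
What this means for a Java consumer of such an API can be sketched as follows. Greeter stands in for an interface compiled from Kotlin, and all names are hypothetical; the sketch only reproduces the shape shown above in plain Java.

```java
// Illustrative sketch: all names here are hypothetical.
final class DefaultImplsDemo {
    public static void main(String[] args) {
        System.out.println(new JavaGreeter().foo()); // prints 42
    }
}

// Stands in for an interface compiled from Kotlin with a method body.
interface Greeter {
    int foo();

    // Shape of the holder class that carries the method bodies.
    final class DefaultImpls {
        static int foo(Greeter self) {
            return 42;
        }
    }
}

// A Java implementor inherits no body: it must declare foo() itself
// and may delegate to DefaultImpls explicitly.
final class JavaGreeter implements Greeter {
    @Override
    public int foo() {
        return Greeter.DefaultImpls.foo(this);
    }
}
```

Without the explicit foo() in JavaGreeter the class would not compile, because from Java's point of view foo() is an ordinary abstract method.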

Default arguments

Unlike Java, Kotlin has default arguments, and their implementation is quite interesting.

//Kotlin (file Example8.kt)
fun first(x: Int = 11, y: Long = 22) {
    println(x)
    println(y)
}
fun second() {
    first()
}

To implement default arguments in Java bytecode, a synthetic method is used that receives a bit mask describing which arguments were omitted in the call.

//Java
public final class Example8Kt {
   public static final void first(int x, long y) {
      System.out.println(x);
      System.out.println(y);
   }
   public static void first$default(int var0, long var1, int mask, Object var4) {
      if((mask & 1) != 0) {
         var0 = 11;
      }
      if((mask & 2) != 0) {
         var1 = 22L;
      }
      first(var0, var1);
   }
   public static final void second() {
      first$default(0, 0L, 3, (Object)null);
   }
}

One open question is why the var4 argument is generated. It is not used anywhere, and null is passed at the call sites. I could not find any information about the purpose of this argument; perhaps yole can clarify the situation.
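
The bit-mask convention is easier to follow with calls that omit only some of the arguments. Below is a small runnable Java sketch of the same scheme. The class and method names are hypothetical, and first returns a String instead of printing so that the result is easy to inspect; the trailing marker parameter plays the role of the unused var4.

```java
// Illustrative sketch of the bit-mask convention; names are hypothetical.
final class DefaultArgsSketch {

    static String first(int x, long y) {
        return x + "," + y;
    }

    // Bit i of mask is set when argument i was omitted at the call site.
    static String firstDefault(int x, long y, int mask, Object marker) {
        if ((mask & 1) != 0) {
            x = 11;   // default value for x
        }
        if ((mask & 2) != 0) {
            y = 22L;  // default value for y
        }
        return first(x, y);
    }

    public static void main(String[] args) {
        System.out.println(firstDefault(0, 0L, 3, null)); // like first()
        System.out.println(firstDefault(5, 0L, 2, null)); // like first(5)
        System.out.println(firstDefault(0, 7L, 1, null)); // like first(y = 7)
        System.out.println(first(5, 7L));                 // like first(5, 7): no mask needed
    }
}
```

Omitted arguments are passed as zero placeholders, and only the mask tells the synthetic method which of them to overwrite with defaults. A call that supplies every argument goes straight to the real method.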

The cost estimates for such manipulations are shown below:

(Chart: default arguments)

The cost of default arguments is already slightly more noticeable. Still, the losses are measured in nanoseconds and can be neglected in everyday code. There is also a way to make the Kotlin compiler generate default arguments in bytecode differently. To do this, add the @JvmOverloads annotation:

//Kotlin
@JvmOverloads
fun first(x: Int = 11, y: Long = 22) {
    println(x)
    println(y)
}

In this case, in addition to the methods from the previous example, overloads of the first method will also be generated for various options for passing arguments.

//Java
public final class Example8Kt {
   //-- the methods first, second, first$default from the previous example
   @JvmOverloads
   public static final void first(int x) {
      first$default(x, 0L, 2, (Object)null);
   }
   @JvmOverloads
   public static final void first() {
      first$default(0, 0L, 3, (Object)null);
   }
}

Lambdas

Kotlin lambdas are represented in much the same way as in Java (except that they are first-class objects).

//Kotlin (file Lambda1.kt)
fun <T> runLambda(x: () -> T): T = x()

Here the runLambda function accepts an instance of the Function0 interface (declared in the Kotlin standard library), which has an invoke() function. This is all compatible with how things work in Java 8, and SAM conversion from Java works as well. The resulting bytecode looks like this:

//Java
public final class Lambda1Kt {
   public static final Object runLambda(@NotNull Function0 x) {
      Intrinsics.checkParameterIsNotNull(x, "x");
      return x.invoke();
   }
}

How a lambda is compiled to bytecode depends heavily on whether it captures a value from the surrounding context. Consider an example with a global variable value and a lambda that simply returns it.

//Kotlin (file Lambda2.kt)
var value = 0
fun noncapLambda(): Int = runLambda { value }

In Java terms, what gets created here is essentially a singleton. The lambda uses nothing from its context, so there is no need to create a separate instance for every call. The compiler therefore simply generates a class that implements the Function0 interface, and as a result the lambda is called without any allocation and is very cheap.

//Java
final class Lambda2Kt$noncapLambda$1 extends Lambda implements Function0 {
   public static final Lambda2Kt$noncapLambda$1 INSTANCE = new Lambda2Kt$noncapLambda$1();
   public final int invoke() {
      return Lambda2Kt.getValue();
   }
}
public final class Lambda2Kt {
   private static int value;
   public static final int getValue() {
      return value;
   }
   public static final void setValue(int var0) {
      value = var0;
   }
   public static final int noncapLambda() {
      return ((Number)Lambda1Kt.runLambda(Lambda2Kt$noncapLambda$1.INSTANCE)).intValue();
   }
}

Consider another example, in which the lambda captures a local variable from its context.

//Kotlin (file Lambda3.kt)
fun capturingLambda(v: Int): Int = runLambda { v }

In this case a singleton will not do, since each instance of the lambda must have its own value of the captured parameter.

//Java
public static final int capturingLambda(int v) {
      return ((Number)Lambda1Kt.runLambda((Function0)(new Function0() {
          public Object invoke() {
            return Integer.valueOf(this.invoke());
         }
         public final int invoke() {
            return v;
         }
      }))).intValue();
 }

Kotlin lambdas can also modify captured local variables of the enclosing function (unlike Java lambdas).

//Kotlin (file Lambda4.kt)
fun mutatingLambda(): Int {
    var x = 0
    runLambda { x++ }
    return x
}

In this case a wrapper object is created for the variable being modified. As in the previous example, the wrapper is passed to the generated lambda, and inside the lambda the original variable is changed through the wrapper.

//Java
public final class Lambda4Kt {
   public static final int mutatingLambda() {
      final IntRef x = new IntRef();
      x.element = 0;
      Lambda1Kt.runLambda((Function0)(new Function0() {
         public Object invoke() {
            return Integer.valueOf(this.invoke());
         }
         public final int invoke() {
            int var1 = x.element++;
            return var1;
         }
      }));
      return x.element;
   }
}

Let's compare the performance of these Kotlin solutions with their Java counterparts:

(Chart: lambdas)

As you can see, the work with wrappers (the last example) takes noticeable time. On the other hand, Java does not support this out of the box at all, and if you write a similar implementation by hand, the cost will be about the same. In the other cases the difference is not that noticeable.
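
A hand-written Java version of the wrapper approach could look roughly like the sketch below. IntBox and the other names are hypothetical; IntBox is only a hand-made counterpart of the generated wrapper, not the class from the Kotlin runtime, and java.util.function.Supplier is used in place of Function0.

```java
import java.util.function.Supplier;

// Illustrative sketch: IntBox is a hypothetical hand-written wrapper.
final class MutableCaptureSketch {

    static final class IntBox {
        int element;
    }

    static <T> T runLambda(Supplier<T> block) {
        return block.get();
    }

    static int mutatingLambda() {
        // A captured local must be effectively final in Java, so the mutable
        // state is moved to the heap, inside the box.
        final IntBox x = new IntBox();
        x.element = 0;
        runLambda(() -> x.element++);
        return x.element;
    }

    public static void main(String[] args) {
        System.out.println(mutatingLambda()); // prints 1
    }
}
```

The extra allocation of the box and the indirect field access are where the measured overhead comes from, regardless of whether the compiler or the programmer writes the wrapper.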

Kotlin also lets you pass method references where a lambda is expected, and, unlike lambdas, they store information about which method they point to. Method references are compiled much like lambdas that capture no context: a singleton is created which, in addition to the value, also knows which method it refers to.

Kotlin lambdas have another interesting feature: a function that takes a lambda can be declared with the inline modifier. In this case the compiler finds every place where the function is called and replaces the call with the body of the function. The JIT is also able to inline some things on its own, but you can never be sure whether it will inline a particular call or not. So having your own controllable inlining mechanism never hurts.

//Kotlin (file Lambda5.kt)
fun inlineLambda(x: Int): Int = run { x }
//run is a function from the standard library:
public inline fun <R> run(block: () -> R): R = block()

//Java
public final class Lambda5Kt {
   public static final int inlineLambda(int x) {
      return x;
   }
}

In the example above there are no allocations and no calls. In effect, the function code simply "collapses". This makes it possible to implement all kinds of filter, map, and similar functions very efficiently. The synchronized function is likewise inline.

Continued in Part 2

Thank you for your attention!
I hope you enjoyed the article. If you have noticed any errors or inaccuracies, please let me know in a private message.
