Multi-release JARs - Bad or Good?

Translation

From the translator: we are actively migrating our platform to Java 11 and are thinking about how to develop Java libraries (such as YARG) effectively, taking into account the differences between Java 8 and 11, so that we do not have to maintain separate branches and releases. One possible solution is a multi-release JAR, but even that does not go entirely smoothly.

Java 9 introduced a new feature of the Java runtime called multi-release JARs. It is probably one of the most controversial additions to the platform. TL;DR: we consider it a poor solution to a serious problem. In this post we explain why we think so, and also show how to build such a JAR if you really want to.

Multi-release JARs, or MR JARs, are a new feature of the Java platform introduced in JDK 9. Here we describe in detail the significant risks of using this technology, and how to create multi-release JARs with Gradle if you still want to.

A multi-release JAR is a Java archive that contains several variants of the same class, each intended for a different version of the runtime. For example, if you run on JDK 8, the runtime uses the class version for JDK 8, and if you run on Java 9, it uses the version for Java 9. Similarly, if a version is provided for a future Java 10 release, a Java 10 runtime uses it instead of the Java 9 version or the default (Java 8) version.
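To make the selection rule concrete, here is a minimal sketch that builds a throwaway multi-release JAR with the standard java.util.jar API and then opens it as different runtime versions. The class name MrJarLookupDemo, the entry name demo/greeting.txt and its contents are made up for illustration; a text entry stands in for real class files only to keep the example self-contained. It needs Java 9 or later to run, because the version-aware JarFile constructor does not exist before that.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

class MrJarLookupDemo {

    // Builds a temporary multi-release JAR holding two variants of one entry.
    static Path buildDemoJar() throws IOException {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MULTI_RELEASE, "true");

        Path jar = Files.createTempFile("mrjar-demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar), manifest)) {
            write(out, "demo/greeting.txt", "base variant");
            write(out, "META-INF/versions/9/demo/greeting.txt", "Java 9 variant");
        }
        return jar;
    }

    private static void write(JarOutputStream out, String name, String content) throws IOException {
        out.putNextEntry(new JarEntry(name));
        out.write(content.getBytes(StandardCharsets.UTF_8));
        out.closeEntry();
    }

    // Opens the JAR as if running on the given version and reads the entry
    // under its plain (unversioned) name; JarFile resolves the variant.
    static String greetingFor(Path jar, Runtime.Version version) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile(), true, ZipFile.OPEN_READ, version)) {
            JarEntry entry = jarFile.getJarEntry("demo/greeting.txt");
            try (InputStream in = jarFile.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = buildDemoJar();
        System.out.println(greetingFor(jar, JarFile.baseVersion()));        // treated like Java 8
        System.out.println(greetingFor(jar, Runtime.Version.parse("9")));   // treated like Java 9
        System.out.println(greetingFor(jar, JarFile.runtimeVersion()));     // the running JVM
    }
}
```

The caller always asks for the same name; which bytes come back depends only on the version the archive is opened with.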

Below, we look at how the new JAR format works and find out whether it is needed at all.

When to use multi-release JARs

Optimized runtime. This addresses a problem many developers face: when you develop an application, you do not know in advance which runtime it will run on. However, you can embed specialized versions of the same class for particular runtimes. Suppose you want to display the version number of the Java runtime the application is running on. On Java 9 you can use the Runtime.version method. However, this is a new method, available only on Java 9+. If you need to support other runtimes, say Java 8, you have to parse the java.version property. As a result, you end up with 2 different implementations of the same function.
Conflicting APIs: resolving conflicts between APIs is also a common problem. For example, you need to support two runtimes, but one of them has deprecated APIs. There are 2 common solutions to this problem:
1. The first is reflection. For example, you can define a VersionProvider interface, then 2 concrete classes, Java8VersionProvider and Java9VersionProvider, and load the appropriate class at runtime (funnily enough, you have to parse the version number to choose between the two classes!). A variant of this solution is a single class with different methods that are invoked through reflection.
2. A more advanced solution is to use method handles where possible. Reflection will most likely strike you as slow and awkward, and by and large it is.
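The reflection approach can be sketched roughly as follows. The names VersionProvider, Java8VersionProvider and Java9VersionProvider come from the text; the holder class VersionProviders, the parseMajor helper and the load method are assumptions made for this illustration. Everything sits in one file for brevity, so the sketch itself needs a Java 9+ compiler. In a real project the two implementations would be compiled separately, which is exactly the inconvenience described above.

```java
class VersionProviders {

    interface VersionProvider {
        int majorVersion();
    }

    // Java 8 style: there is no version API, so parse the java.version property.
    static final class Java8VersionProvider implements VersionProvider {
        @Override
        public int majorVersion() {
            return parseMajor(System.getProperty("java.version"));
        }
    }

    // Java 9+ style: Runtime.version() does not exist on Java 8.
    static final class Java9VersionProvider implements VersionProvider {
        @Override
        public int majorVersion() {
            return Runtime.version().version().get(0);
        }
    }

    // Handles both the old scheme ("1.8.0_181" -> 8) and the new one ("9.0.4" -> 9, "11" -> 11).
    static int parseMajor(String javaVersion) {
        String[] parts = javaVersion.split("[^0-9]+");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Picks the implementation by name and instantiates it reflectively, so that
    // the Java 9 class is only loaded on a runtime that can actually run it.
    static VersionProvider load() throws ReflectiveOperationException {
        boolean java9OrLater = parseMajor(System.getProperty("java.version")) >= 9;
        String simpleName = java9OrLater ? "Java9VersionProvider" : "Java8VersionProvider";
        Class<?> impl = Class.forName(VersionProviders.class.getName() + "$" + simpleName);
        return impl.asSubclass(VersionProvider.class).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        VersionProvider provider = load();
        System.out.println(provider.getClass().getSimpleName() + " reports Java " + provider.majorVersion());
    }
}
```

Note that the selection logic has to parse java.version anyway, which is the irony the text points out.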

Known alternatives to multi-release JARs

Another approach, simpler and easier to understand, is to produce 2 different archives for the different runtimes. In theory, you write two implementations of the same class in the IDE, and compiling, testing and correctly packaging them into 2 different artifacts is the build system's job. This is the approach that Guava and Spock have used for years. It is also required for languages such as Scala, because there are so many compiler and runtime variants that binary compatibility becomes almost unattainable.

But there are many other reasons for using separate JAR archives:

A JAR is only packaging.

It is a build artifact that contains classes, but not only classes: resources are usually included in the archive as well. Packaging, like resource processing, has a cost. The Gradle team aims to improve build quality and reduce the time developers wait for compilation, tests and the build as a whole. If the archive is required too early in the process, an unnecessary synchronization point is created. For example, to compile classes that depend on an API, all you need is the .class files: no JAR, and no resources in the JAR. Similarly, to run tests Gradle needs only class files and resources. There is no need to create a JAR for testing. Only an external consumer needs it (that is, when publishing). But if creating the artifact becomes mandatory, some tasks can no longer run in parallel and the whole build slows down. For small projects this is not critical, but for large corporate projects it is a major slowdown.

More importantly, being an artifact, a JAR cannot carry dependency information.

The dependencies of a class on the Java 9 runtime and on the Java 8 runtime are unlikely to be the same. In our simple example they are, but for larger projects this does not hold: typically you would import a backport library for the Java 9 functionality and use it to implement the Java 8 version of the class. However, if both versions are packaged in one archive, a single artifact contains elements with different dependency trees. This means that when you run on Java 9, you carry dependencies you never need. Worse, this pollutes the classpath and creates potential conflicts for users of the library.

Finally, in one project, you can create JARs for different purposes:

for API
for Java 8
for Java 9
with native binding
etc.

Abusing classifiers leads to conflicts that come from sharing one mechanism for different purposes. Typically sources or javadoc JARs are published with classifiers, but they do not actually have any dependencies.

We do not want to create discrepancies: the build process should not depend on how you obtain the classes. In other words, multi-release JARs have a side effect: consuming classes from a JAR and consuming them from a class directory are now completely different things. The semantics differ hugely!
Depending on which tool you use to create the JAR, you may end up with incompatible JARs! The only tool that guarantees that two variants of a class packed into one archive expose the same public API is the jar utility itself, which, for good reasons, build tools and even users do not necessarily use. A JAR is essentially an “envelope” that resembles a ZIP archive. So depending on how you assemble it, you get different behavior, and you may well produce an incorrect artifact without noticing.
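As an illustration of how little a generic ZIP-level tool knows about these rules, here is a rough sanity check that only compares entry names: it lists versioned class entries that have no counterpart in the base part of the archive. This is a hypothetical helper (the class VersionedEntryCheck and its methods are invented for this sketch) and it is far weaker than comparing public APIs. In particular, it also flags non-public helper classes that are legitimately allowed to exist only under a versioned directory.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

class VersionedEntryCheck {

    private static final String PREFIX = "META-INF/versions/";

    // Reads the raw entry names of a JAR, versioned directories included.
    static List<String> entryNames(File jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.stream().map(JarEntry::getName).collect(Collectors.toList());
        }
    }

    // Returns versioned .class entries whose path does not exist in the base part.
    static List<String> withoutBaseCounterpart(Collection<String> entryNames) {
        Set<String> base = new HashSet<>();
        for (String name : entryNames) {
            if (!name.startsWith("META-INF/")) {
                base.add(name);
            }
        }
        List<String> orphans = new ArrayList<>();
        for (String name : entryNames) {
            if (!name.startsWith(PREFIX) || !name.endsWith(".class")) {
                continue;
            }
            // Skip the version number segment, e.g. "9/" in META-INF/versions/9/...
            int slash = name.indexOf('/', PREFIX.length());
            if (slash < 0) {
                continue;
            }
            String relative = name.substring(slash + 1);
            if (!base.contains(relative)) {
                orphans.add(name);
            }
        }
        return orphans;
    }

    public static void main(String[] args) throws IOException {
        for (String orphan : withoutBaseCounterpart(entryNames(new File(args[0])))) {
            System.out.println("no base counterpart: " + orphan);
        }
    }
}
```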

More efficient ways to manage individual JAR archives

The main reason developers do not use separate archives is that they are inconvenient to build and consume. The build tools are to blame: before Gradle, they could not cope with this at all. In particular, those who used this approach with Maven could rely only on the weak classifier feature to publish additional artifacts. But classifiers do not help in this situation. They are used for all kinds of purposes, from publishing sources, documentation and javadocs to publishing library variants (guava-jdk5, guava-jdk7, ...) or different use cases (api, fat jar, ...). In practice, there is no way to express that the dependency tree of a classifier differs from the dependency tree of the main artifact. In other words, the POM format is fundamentally broken, because it represents both how the component is built and the artifacts it delivers. Suppose you need to publish 2 different JARs: a classic one and a fat JAR that bundles all dependencies. Maven assumes the 2 artifacts have identical dependency trees, even though this is obviously wrong! Here it is more than obvious, but the situation is the same with multi-release JARs!

The solution is to handle variants properly. Gradle can do this through variant-aware dependency management. At the moment this feature is available for Android development, but we are also working on a version for Java and native applications!

Variant-aware dependency management is based on the observation that modules and artifacts are completely different things. The same code can work well on different runtimes, each with its own requirements. For those who work with native compilation this has long been obvious: we compile for i386 and amd64, and there is no way to mix the dependencies of an i386 library with those of an arm64 one! In the Java context, this means that for Java 8 you produce a “Java 8” variant of the JAR, whose class format corresponds to Java 8. This artifact carries metadata describing which dependencies to use. For Java 8 or 9, the dependencies matching that version are selected. It is that simple (in practice not quite, because the runtime is only one dimension of a variant, and several can be combined).

Of course, nobody has done this before because of the excessive complexity: Maven, apparently, will never let you pull off something this complicated. But with Gradle it is possible. The Gradle team is currently working on a new metadata format that tells consumers which variant to use. Simply put, the build tool has to handle compiling, testing, packaging and consuming such modules. For example, say the project must work on both the Java 8 and Java 9 runtimes. Ideally, you produce 2 versions of the library. That means 2 different compilers (to avoid using Java 9 APIs when targeting Java 8), 2 class directories and, finally, 2 different JARs. You will most likely also want to test on both runtimes. Or do you implement 2 archives,

So far this scheme has not been implemented, but the Gradle team has made significant progress in this direction.

How to create a multi-release JAR using Gradle

But if this feature is not ready yet, what should you do? Relax: correct artifacts are produced the same way. Until the feature described above arrives in the Java ecosystem, there are two options:

the good old way using reflection or different JAR archives;
use multi-release JARs (note that this can be a bad decision even when there are good use cases).

Whichever you choose, separate archives or a multi-release JAR, the setup is the same. Multi-release JARs are essentially the wrong kind of packaging: they should be an option, not a goal. Technically, the source layout is the same for separate JARs and for a multi-release JAR. This repository describes how to create a multi-release JAR using Gradle. The essence of the process is briefly described below.

First of all, keep in mind one bad habit of developers: they run Gradle (or Maven) on the same version of Java on which they plan to run the artifacts. Sometimes a later version is used to run Gradle while compilation targets an earlier API level. But there is no particular reason to work this way. Gradle supports cross-compilation. It lets you describe where a JDK is located and fork compilation into a separate process so that a component is compiled with that JDK. The best way to configure the various JDKs is to set their paths through environment variables, as done in this file. Then you only need to configure Gradle to use the right JDK based on the source/target compatibility. Note that, starting with JDK 9, older JDKs are no longer needed for cross-compilation. This is made possible by a new javac option, --release. Gradle recognizes this option and configures the compiler accordingly.
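To see what the javac --release option buys you, here is a small sketch that uses the standard javax.tools compiler API to compile the same one-line source at two API levels. The class ReleaseFlagDemo, its compiles helper and the sample source are made up for this illustration. It assumes it runs on a JDK (not a JRE) whose javac still accepts both release 8 and release 9.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class ReleaseFlagDemo {

    // An in-memory source file, so the demo does not need files on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Returns true if the source compiles against the API of the given release.
    static boolean compiles(String release, String className, String code) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler: run this on a JDK, not a JRE");
        }
        Path out = Files.createTempDirectory("release-demo");
        List<String> options = Arrays.asList("--release", release, "-d", out.toString());
        // Collect diagnostics instead of printing them to stderr.
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        return compiler.getTask(null, null, diagnostics, options, null,
                Collections.singletonList(new StringSource(className, code))).call();
    }

    public static void main(String[] args) throws IOException {
        String source = "class UsesJava9Api { Object v = Runtime.version(); }";
        System.out.println("--release 8 compiles: " + compiles("8", "UsesJava9Api", source));
        System.out.println("--release 9 compiles: " + compiles("9", "UsesJava9Api", source));
    }
}
```

The point is that a single modern JDK can reject Java 9 API usage when targeting Java 8, without a Java 8 installation on the machine.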

Another key concept is the source set. A source set is a set of source files that need to be compiled together. A JAR is built from the output of one or more source sets. For each source set, Gradle automatically creates a corresponding compile task that can be customized. This means that if there are sources for Java 8 and for Java 9, they live in different source sets. This is exactly how the source set for Java 9, which contains the Java 9 version of our class, is set up. It really works, and you do not need to create a separate project, as you would in Maven. Most importantly, this approach lets you fine-tune the compilation of each source set.

One difficulty with having different versions of the same class is that the class code is rarely independent of the rest of the code (it depends on classes that live in the main source set). For example, its API may use classes that need no special sources for Java 9 support. At the same time, we do not want to recompile all these shared classes and package Java 9 versions of them. They are shared classes, so they must live separately from the classes for a particular JDK. This is configured here: add a dependency from the Java 9 source set to the main source set, so that when the Java 9 version is compiled all shared classes are on the compilation classpath.

The next step is simple: tell Gradle that the main source set is compiled at the Java 8 API level and the Java 9 source set at the Java 9 level.

All of the above will help you when using both of the previously mentioned approaches: the implementation of separate JAR archives or a multi-release JAR. Since this is a post on this topic, let's look at an example of how to get Gradle to build a multi-release JAR:

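jar {
  // Put the Java 9 classes where the runtime looks for versioned entries
  into('META-INF/versions/9') {
     from sourceSets.java9.output
  }
  // Mark the archive as multi-release so that Java 9+ honors META-INF/versions
  manifest.attributes(
     'Multi-Release': 'true'
  )
}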

This block does two things: it packages the Java 9 classes into the META-INF/versions/9 directory, which is what an MR JAR uses, and it sets the Multi-Release attribute in the manifest.

That's it, your first MR JAR is ready!

Unfortunately, the work is not over. If you have worked with Gradle, you know that with the application plugin you can run the application directly through the run task. However, because Gradle tries to keep the amount of work to a minimum, the run task normally uses the class directories and the processed resources directories rather than the JAR. For multi-release JARs this is a problem, because the JAR is needed right away! So instead of relying on the plugin you have to create your own task, and this is one argument against using multi-release JARs.

Last but not least, we mentioned that both versions of the class need to be tested. The only way to do this is to run the tests on each VM in a separate process, because the Java runtime has no equivalent of the --release flag. The idea is that you write a test once, but it is executed twice: on Java 8 and on Java 9. This is the only way to make sure that runtime-specific classes work correctly. By default, Gradle creates a single test task, and it also uses class directories instead of the JAR. So we do two things: create a test task for Java 9, and configure both tasks to use the JAR and the specified Java runtime. The implementation looks like this:

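// Default test task: run against the packaged JAR on JDK 8
test {
   dependsOn jar
   // JAVA_8 is an environment variable pointing at a JDK 8 installation
   def jdkHome = System.getenv("JAVA_8")
   // Replace the main class directories with the multi-release JAR
   classpath = files(jar.archivePath, classpath) - sourceSets.main.output
   executable = file("$jdkHome/bin/java")
   doFirst {
       println "$name runs test using JDK 8"
   }
}
// Second test task: same tests, same JAR, but on JDK 9
task testJava9(type: Test){
   dependsOn jar
   // JAVA_9 is an environment variable pointing at a JDK 9 installation
   def jdkHome = System.getenv("JAVA_9")
   classpath = files(jar.archivePath, classpath) - sourceSets.main.output
   executable = file("$jdkHome/bin/java")
   doFirst {
       println classpath.asPath
       println "$name runs test using JDK 9"
   }
}
// Make the regular check lifecycle task run the Java 9 tests as well
check.dependsOn(testJava9)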

Now when you run the check task, Gradle compiles each source set with the correct JDK, builds the multi-release JAR, and then runs the tests against this JAR on both JDKs. Future versions of Gradle will help you do this more declaratively.

Conclusion

Let's sum up. Multi-release JARs are an attempt to solve a real problem that many library developers face. However, the solution looks wrong. Proper dependency management, the link between artifacts and variants, and concern for performance (being able to run as many tasks as possible in parallel) all make MR JARs a poor man's solution. The problem can be solved properly with variants. Still, while variant-aware dependency management in Gradle is under development, multi-release JARs are quite convenient in simple cases. If that is your case, this post should help you understand how to build one, and how Gradle's philosophy differs from Maven's (source set vs project).

Finally, we do not deny that there are cases where multi-release JARs make sense: for example, when you do not know which runtime an application (not a library) will run on, but this is the exception rather than the rule. In this post we described the main problems library developers face and how multi-release JARs try to solve them. Modeling dependencies properly as variants improves performance (through fine-grained parallelism) and reduces maintenance costs (by avoiding accidental complexity) compared to multi-release JARs. You may still need MR JARs in your situation, so Gradle already has you covered. Take a look at this sample project and try it yourself.
