Compact strings in Java 9

Original author: Mahmoud Anouti
Hello again! We have opened enrollment for the ninth Java Developer group (and the tenth group is, surprisingly, already planned for December 31), and we have prepared some interesting material and an open lesson for you.

So let's go.

Want to reduce the amount of memory used by your Java application? See how you can improve performance with the compact strings available in Java 9.

One of the performance improvements introduced in the JVM (Oracle HotSpot, to be precise) as part of Java SE 9 is compact strings. Their purpose is to reduce the size of String objects, which in turn reduces the overall memory footprint of the application. As a result, less time may be spent on garbage collection.

The feature is based on the observation that many String objects do not need 2 bytes to encode each character, since most applications use only Latin-1 characters. Therefore, instead of this:

/** The value is used for character storage. */
private final char value[];

java.lang.String now has:

private final byte[] value;

/**
 * The identifier of the encoding used to encode the bytes in
 * {@code value}. The supported values in this implementation are
 *
 * LATIN1
 * UTF16
 *
 * @implNote This field is trusted by the VM, and is a subject to
 * constant folding if String instance is constant. Overwriting this
 * field after construction will cause problems.
 */
private final byte coder;

In other words, this feature replaces the char array (where each element uses 2 bytes) with a byte array plus one extra byte that identifies the encoding (Latin-1 or UTF-16). This means that in applications that mostly use Latin-1 characters, string data takes roughly half as much heap space. The user will not notice any difference, and related APIs such as StringBuilder automatically take advantage of the change.
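The idea can be illustrated with a minimal sketch. This is not the JDK's actual implementation: the class name CompactSketch, its methods, and the constant values are hypothetical, and big-endian UTF-16 is assumed only to keep the result deterministic.

```java
import java.nio.charset.StandardCharsets;

class CompactSketch {
    // Hypothetical coder values for this sketch; they only mirror the idea of the coder field.
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    /** Returns LATIN1 if every char fits in one byte, otherwise UTF16. */
    static byte chooseCoder(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** Encodes the string with 1 byte per char (Latin-1) or 2 bytes per char (UTF-16). */
    static byte[] encode(String s) {
        return chooseCoder(s) == LATIN1
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16BE); // assumption: big-endian, no byte-order mark
    }

    public static void main(String[] args) {
        System.out.println(encode("java").length);       // 4: one byte per character
        System.out.println(encode("java\u0780").length); // 10: two bytes per character
    }
}
```

Note that a single character outside Latin-1 is enough to switch the whole string to 2 bytes per character.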

To show this change in terms of the size of a String object, I will use Java Object Layout (JOL), a simple utility for visualizing the layout of objects on the heap. Here we are interested in the footprint of the array (stored in the value field above), and not just the reference to it (a byte array reference, like a char array reference, uses 4 bytes). The code below prints this information using JOL's GraphLayout:

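public class JOLSample {
    public static void main(String[] args) {
        // The sample string is elided in this copy; substitute any String for "...".
        System.out.println(org.openjdk.jol.info.GraphLayout.parseInstance("...").toFootprint());
    }
}
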
Running the code above in Java 8 and then in Java 9 shows the difference:

$java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
$java -cp lib\jol-cli-0.9-full.jar;. test.JOLSample
java.lang.String@4554617cd footprint:
     COUNT       AVG       SUM   DESCRIPTION
         1       432       432   [C
         1        24        24   java.lang.String
         2                 456   (total)
$java -version
java version "9"
Java(TM) SE Runtime Environment (build 9+181)
Java HotSpot(TM) 64-Bit Server VM (build 9+181, mixed mode)
$java -cp lib\jol-cli-0.9-full.jar;. test.JOLSample
java.lang.String@73035e27d footprint:
     COUNT       AVG       SUM   DESCRIPTION
         1       224       224   [B
         1        24        24   java.lang.String
         2                 248   (total)

Ignoring the 24 bytes taken by the java.lang.String object itself (header plus fields), we see that the backing array has shrunk from 432 to 224 bytes, almost half, thanks to compact strings.
If we replace the string above with one that contains characters outside Latin-1, for example \u0780, and rerun the code, both Java 8 and Java 9 show the same footprint, because the compact representation can no longer be used.
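The array sizes reported by JOL can be reproduced with simple arithmetic. The sketch below assumes a 64-bit HotSpot JVM with compressed class pointers, which gives a 16-byte array header and 8-byte object alignment. The class name ArrayFootprint and the length of 208 are assumptions: 208 is one length consistent with both reported sizes, not a value stated in the article.

```java
class ArrayFootprint {
    // Assumptions: 16-byte array header (mark word, class pointer, length)
    // and 8-byte object alignment, as on 64-bit HotSpot with compressed class pointers.
    static final int ARRAY_HEADER_BYTES = 16;
    static final int ALIGNMENT_BYTES = 8;

    /** Estimates the heap size of an array, rounded up to the alignment boundary. */
    static long arrayBytes(int length, int bytesPerElement) {
        long raw = ARRAY_HEADER_BYTES + (long) length * bytesPerElement;
        return (raw + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    public static void main(String[] args) {
        int length = 208; // assumed length, inferred from (432 - 16) / 2
        System.out.println(arrayBytes(length, 2)); // 432: char[] in Java 8, or UTF-16 bytes in Java 9
        System.out.println(arrayBytes(length, 1)); // 224: Latin-1 byte[] in Java 9
    }
}
```

The same function also shows why a string that needs UTF-16 gains nothing: at 2 bytes per character the byte array is as large as the old char array.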

This feature can be disabled by passing the -XX:-CompactStrings option to the java command.
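To confirm from inside a running program whether this option was passed, one possibility is to inspect the JVM input arguments. The sketch below is only an illustration: the class name CompactStringsFlag and its method are hypothetical, it sees only options given at startup, and it assumes that the last occurrence of the option wins.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class CompactStringsFlag {
    /** Returns true if the arguments disable compact strings (assumption: the last occurrence wins). */
    static boolean disabledByArgs(List<String> jvmArgs) {
        boolean disabled = false;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:-CompactStrings")) {
                disabled = true;
            } else if (arg.equals("-XX:+CompactStrings")) {
                disabled = false;
            }
        }
        return disabled;
    }

    public static void main(String[] args) {
        // getInputArguments() returns the options passed to the JVM, not the program arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Compact strings disabled by option: " + disabledByArgs(jvmArgs));
    }
}
```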

As always, we look forward to your comments and questions here, and we invite you to the open lesson.
