How to worsen performance, improving it

    We wanted the best, but it turned out as always.
    Viktor Chernomyrdin,
    Russian statesman


    There are times in life when you seem to be doing everything right, but something goes wrong.
    This story is about one of these cases.


    Once I looked at this code and thought about speeding it up:


    public String appendBounds(Data data){
      int beginIndex = data.beginIndex;
      int endIndex = data.endIndex;
      returnnew StringBuilder()
              .append('L')
              .append(data.str, beginIndex, endIndex)
              .append(';')
              .toString();
    }

    First, I wanted to calculate the final length of the string, use variables beginIndexand endIndex(and the fact that, in addition to the trimmed string StringBuilder, 2 more signs will be added to the y), and pass this value to the constructor StringBuilderto immediately select an array of the required size. This thought seemed too obvious to me, so I decided to try something else. By the right thought, I was pushed by the fact that this code was not highlighted in any way by the “Idea”, although this clever girl usually suggests replacing a short chain with StringBuilder::appendthe addition of lines, which is shorter and easier to perceive.


    A hindrance to this simplification is the use of the method StringBuilder.append(CharSequence, int, int). Given that the field data.stris a string, String.substring(beginIndex, endIndex)you can use it to select a substring from it and pass it to the method StringBuilder.append(String).


    Code after conversion:


    public String appendBounds(Data data){
      int beginIndex = data.beginIndex;
      int endIndex = data.endIndex;
      String subString = data.str.substring(beginIndex, endIndex);
      returnnew StringBuilder()
              .append('L')
              .append(subString)
              .append(';')
              .toString();
    }

    And now the “Idea” offers a simplification:


    public String appendBounds(Data data){
      int beginIndex = data.beginIndex;
      int endIndex = data.endIndex;
      return'L' + data.str.substring(beginIndex, endIndex) + ';';
    }

    However, our goal in this case is not so much readability as performance. Compare both ways:


    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
    publicclassStringBuilderAppendBenchmark{
      @Benchmarkpublic String appendSubString(Data data){
        String latinStr = data.latinStr;
        String nonLatinStr = data.nonLatinStr;
        int beginIndex = data.beginIndex;
        int endIndex = data.endIndex;
        String substring = data.nonLatin ?
                nonLatinStr.substring(beginIndex, endIndex) :
                latinStr.substring(beginIndex, endIndex);
        returnnew StringBuilder()
                .append('L')
                .append(substring)
                .append(';')
                .toString();
      }
      @Benchmarkpublic String appendBounds(Data data){
        String latinStr = data.latinStr;
        String nonLatinStr = data.nonLatinStr;
        int beginIndex = data.beginIndex;
        int endIndex = data.endIndex;
        String appended = data.nonLatin ? nonLatinStr : latinStr;
        returnnew StringBuilder()
                .append('L')
                .append(appended, beginIndex, endIndex)
                .append(';')
                .toString();
      }
      @State(Scope.Thread)
      publicstaticclassData{
        String latinStr;
        String nonLatinStr;
        @Param({"true", "false"})
        boolean nonLatin;
        @Param({"5", "10", "50", "100", "500", "1000"})
        privateint length;
        privateint beginIndex;
        privateint endIndex;
        private ThreadLocalRandom random = ThreadLocalRandom.current();
        @Setuppublicvoidsetup(){
          latinStr = randomString("abcdefghijklmnopqrstuvwxyz");
          nonLatinStr = randomString("абвгдеёжзиклмнопрстуфхцчшщьыъэюя");
          beginIndex = 1;
          endIndex = length + 1;
        }
        private String randomString(String alphabet){
          char[] chars = alphabet.toCharArray();
          StringBuilder sb = new StringBuilder(length + 2);
          for (int i = 0; i < length + 2; i++) {
            char c = chars[random.nextInt(chars.length)];
            sb.append(c);
          }
          return sb.toString();
        }
      }
    }

    The benchmark is simple as two kopecks: a StringBuilderrandom string is added to -y, the size of which is determined by the field length, and since it is 2019 in the yard, you need to check as a string containing only the characters of the main Latin (so-called compressed string in which each character corresponds to 1 byte), and a string with non-Latin characters (each character is represented by 2 bytes).


    When we look at the surface, the method appendSubStringseems to be slower, because the amount of data to be pasted is the same as in the method appendBounds, but the method also appendSubStringhas an explicit creation of a substring, i.e., allocating memory to a new object and copying contents from it into data.latinStr/ data.nonLatinStr.


    All the more surprising (but only at first glance) are the results of the measurement that I performed using JDK11 on my home machine (Intel Core i5-4690, 3.50 GHz):


    Benchmark                           nonLatin   length     Score   Error   Units
    appendBounds                            true        5      44,6 ±   0,4   ns/op
    appendBounds                            true       10      45,7 ±   0,7   ns/op
    appendBounds                            true       50     129,0 ±   0,5   ns/op
    appendBounds                            true      100     218,7 ±   0,8   ns/op
    appendBounds                            true      500     907,1 ±   5,5   ns/op
    appendBounds                            true     1000    1626,4 ±  13,0   ns/op
    appendSubString                         true        5      28,6 ±   0,2   ns/op
    appendSubString                         true       10      30,8 ±   0,2   ns/op
    appendSubString                         true       50      65,6 ±   0,4   ns/op
    appendSubString                         true      100     106,6 ±   0,6   ns/op
    appendSubString                         true      500     430,1 ±   2,4   ns/op
    appendSubString                         true     1000     839,1 ±   8,6   ns/op
    appendBounds:·gc.alloc.rate.norm        true        5     184,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm        true       10     200,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm        true       50     688,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm        true      100    1192,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm        true      500    5192,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm        true     1000   10200,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true        5     136,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true       10     160,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true       50     360,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true      100     608,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true      500    2608,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm     true     1000    5104,0 ±   0,0    B/op
    appendBounds                           false        5      20,8 ±   0,1   ns/op
    appendBounds                           false       10      24,0 ±   0,2   ns/op
    appendBounds                           false       50      66,4 ±   0,4   ns/op
    appendBounds                           false      100     111,0 ±   0,8   ns/op
    appendBounds                           false      500     419,2 ±   2,7   ns/op
    appendBounds                           false     1000     840,4 ±   7,8   ns/op
    appendSubString                        false        5      25,3 ±   0,3   ns/op
    appendSubString                        false       10      25,7 ±   0,2   ns/op
    appendSubString                        false       50      36,0 ±   0,1   ns/op
    appendSubString                        false      100      52,8 ±   0,4   ns/op
    appendSubString                        false      500     206,1 ±   6,1   ns/op
    appendSubString                        false     1000     388,1 ±   1,6   ns/op
    appendBounds:·gc.alloc.rate.norm       false        5      80,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm       false       10      88,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm       false       50     320,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm       false      100     544,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm       false      500    2144,0 ±   0,0    B/op
    appendBounds:·gc.alloc.rate.norm       false     1000    4152,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false        5      96,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false       10     112,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false       50     192,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false      100     288,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false      500    1088,0 ±   0,0    B/op
    appendSubString:·gc.alloc.rate.norm    false     1000    2088,0 ±   0,0    B/op

    appendSubStringIn the overwhelming majority of cases (including always for non-Latin strings), the method that refuted our assumption turned out to be both faster and less voracious (and this despite the fact that it String::substringreturns a new object). How did it happen?


    I look in the book, I see a fig


    The source code StringBuilder-a will help lift the veil of secrecy . Both methods used pass the call to the same methods AbstractStringBuilder:


    publicfinalclassStringBuilderextendsAbstractStringBuilderimplementsjava.io.Serializable, Comparable<StringBuilder>, CharSequence{
      @Overridepublic StringBuilder append(String str){
        super.append(str);
        returnthis;
      }
      @Overridepublic StringBuilder append(CharSequence s, int start, int end){
        super.append(s, start, end);
        returnthis;
      }
    }

    Go to AbstractStringBuilder.append(String):


    public AbstractStringBuilder append(String str){
      if (str == null) {
        return appendNull();
      }
      int len = str.length();
      ensureCapacityInternal(count + len);
      putStringAt(count, str);
      count += len;
      returnthis;
    }
    privatefinalvoidputStringAt(int index, String str){
      if (getCoder() != str.coder()) {
        inflate();
      }
      str.getBytes(value, index, coder);
    }

    What is interesting here? The method AbstractStringBuilder::inflate, as the name suggests, expands the array AbstractStringBuilder.valuewhen combining dissimilar strings. Moving the same data occurs in the method String::getBytes:


    voidgetBytes(byte[] dst, int dstBegin, byte coder){
      if (coder() == coder) {
        System.arraycopy(value, 0, dst, dstBegin << coder, value.length);
      } else {    // this.coder == LATIN && coder == UTF16
        StringLatin1.inflate(value, 0, dst, dstBegin, value.length);
      }
    }

    What is important? If the strings are homogeneous, then data movement is intrinsic System::arraycopy, in the opposite case StringLatin1::inflate, which leads us to the method through delegation StringUTF16::inflate:


    // inflatedCopy byte[] -> byte[]@HotSpotIntrinsicCandidatepublicstaticvoidinflate(byte[] src, int srcOff, byte[] dst, int dstOff, int len){
      // We need a range check here because 'putChar' has no checks
      checkBoundsOffCount(dstOff, len, dst);
      for (int i = 0; i < len; i++) {
        putChar(dst, dstOff++, src[srcOff++] & 0xff);
      }
    }
    @HotSpotIntrinsicCandidatestaticvoidputChar(byte[] val, int index, int c){
      assert index >= 0 && index < length(val) : "Trusted caller missed bounds check";
      index <<= 1;
      val[index++] = (byte)(c >> HI_BYTE_SHIFT);
      val[index]   = (byte)(c >> LO_BYTE_SHIFT);
    }

    Thus, if the strings are homogeneous, then the platform-dependent method is used to move the data System::arraycopy, otherwise - the cycle (also intrinsicized). This means that when pasting two strings, where all characters belong to the main Latin alphabet (to put it simply, they fit into 1 byte), the performance should be much better than when sticking together heterogeneous strings. The benchmark confirms this (see conclusion for nonLatin = false).


    Now the method is AbstractStringBuilder.append(CharSequence, int, int):


    @Overridepublic AbstractStringBuilder append(CharSequence s, int start, int end){
      if (s == null) {
        s = "null";
      }
      checkRange(start, end, s.length());
      int len = end - start;
      ensureCapacityInternal(count + len);
      appendChars(s, start, end);
      returnthis;
    }
    privatefinalvoidappendChars(CharSequence s, int off, int end){
      if (isLatin1()) {
        byte[] val = this.value;
        for (int i = off, j = count; i < end; i++) {
          char c = s.charAt(i);
          if (StringLatin1.canEncode(c)) {
            val[j++] = (byte)c;
          } else {
            count = j;
            inflate();
            StringUTF16.putCharsSB(this.value, j, s, i, end);
            count += end - i;
            return;
          }
        }
      } else {
        StringUTF16.putCharsSB(this.value, count, s, off, end);
      }
      count += end - off;
    }

    Here, the approach is similar to that existing in the previous example: for homogeneous strings a simpler mechanism is used (here, cognitive copying in a loop), for heterogeneous strings — a call to StringUTF16, but note that the method being called is StringUTF16::putCharsSBnot intrinsic.


    publicstaticvoidputCharsSB(byte[] val, int index, CharSequence s, int off, int end){
        checkBoundsBeginEnd(index, index + end - off, val);
        for (int i = off; i < end; i++) {
            putChar(val, index++, s.charAt(i));
        }
    }

    So, the internal structure of both methods and the reason for their different performance are more or less clear to us. The question naturally arises - what to do with the knowledge gained next? There are several options at once:


    1) keep it in your head and, when a suspicious code is detected, change it with your hands
    2) go to Tagir and ask him to sip a check that will do the work for us
    3) make changes to the JDK so that the code does not change at all.


    Of course, we start with the third. Ready to take a chance?


    Dive into the abyss


    We will train on catson the source code of the eleventh Java, you can download it here .


    The simplest and most obvious way to improve is to select the substring already inside the method AbstractStringBuilder.append(CharSequence, int, int):


    // былоpublic AbstractStringBuilder append(CharSequence s, int start, int end){
      if (s == null) {
        s = "null";
      }
      checkRange(start, end, s.length());
      int len = end - start;
      ensureCapacityInternal(count + len);
      appendChars(s, start, end);
      returnthis;
    }
    // сталоpublic AbstractStringBuilder append(CharSequence s, int start, int end){
        if (s == null) {
            s = "null";
        }
        checkRange(start, end, s.length());
        return append(s.subSequence(start, end).toString());
    }

    Now you need to build a JDK, run the tests and run a benchmark on it StringBuilderAppendBenchmark::appendBounds, the results of which need to be compared with the results of the same benchmark on the original JDK:


    # здесь колонка before содержит данные прогона на исходном JDK,
    # after - на изменённом
    Benchmark           nonLatin  length     before      after    Units
    avgt                    true       5       44,6       64,4    ns/op
    avgt                    true      10       45,7       66,3    ns/op
    avgt                    true      50      129,0      168,9    ns/op
    avgt                    true     100      218,7      281,9    ns/op
    avgt                    true     500      907,1     1116,2    ns/op
    avgt                    true    1000     1626,4     2002,5    ns/op
    gc.alloc.rate.norm      true       5      184,0      264,0     B/op
    gc.alloc.rate.norm      true      10      200,0      296,0     B/op
    gc.alloc.rate.norm      true      50      688,0      904,0     B/op
    gc.alloc.rate.norm      true     100     1192,0     1552,0     B/op
    gc.alloc.rate.norm      true     500     5192,0     6752,0     B/op
    gc.alloc.rate.norm      true    1000    10200,0    13256,0     B/op
    avgt                   false       5       20,8       38,0    ns/op
    avgt                   false      10       24,0       37,8    ns/op
    avgt                   false      50       66,4       82,9    ns/op
    avgt                   false     100      111,0      138,8    ns/op
    avgt                   false     500      419,2      531,9    ns/op
    avgt                   false    1000      840,4     1002,7    ns/op
    gc.alloc.rate.norm     false       5       80,0      152,0     B/op
    gc.alloc.rate.norm     false      10       88,0      168,0     B/op
    gc.alloc.rate.norm     false      50      320,0      440,0     B/op
    gc.alloc.rate.norm     false     100      544,0      688,0     B/op
    gc.alloc.rate.norm     false     500     2144,0     2688,0     B/op
    gc.alloc.rate.norm     false    1000     4152,0     5192,0     B/op

    What is called, suddenly! Improvements not only did not happen, but worsened. Damn it, but how?


    The fact is that at the very beginning, in the description of the method, StringBuilder::appendI made one small but critical omission. The method was described like this:


    publicfinalclassStringBuilder{
      @Overridepublic StringBuilder append(String str){
        super.append(str);
        returnthis;
      }
    }

    And here is its full look:


    publicfinalclassStringBuilder{
      @Override@HotSpotIntrinsicCandidatepublic StringBuilder append(String str){
        super.append(str);
        returnthis;
      }
    }

    The Java code, which we considered above, being warmed up and compiled at the C2 level, does not matter, since it is not performed, but the intrinsic. This is easily proved by removing the profile using the async-profiler . Hereinafter, the profile is removed for length = 1000and nonLatin = true:


    # профиль метода `appendSubString`, JDK без наших изменений
              ns  percent  samples  top
      ----------  -------  -------  ---
     19096340914   43.57%  1897673  jbyte_disjoint_arraycopy   <---------
     13500185356   30.80%  1343343  jshort_disjoint_arraycopy  <---------
      4124818581    9.41%   409533  java.lang.String.<init>           # создание подстроки
      2177311938    4.97%   216375  java.lang.StringUTF16.compress    # создание подстроки
      1557269661    3.55%   154253  java.util.Arrays.copyOfRange      # создание подстроки
       349344451    0.80%    34823  appendSubString_avgt_jmhStub
       279803769    0.64%    27862  java.lang.StringUTF16.newString
       274388920    0.63%    27312  org.openjdk.jmh.infra.Blackhole.consume
       160962540    0.37%    15946  SpinPause
       122418222    0.28%    11795  __memset_avx2

    The code StringBuilder-a (yes and AbstractStringBuilder-a) does not even smell here, almost 3/4 of the profile is occupied by an intrinsic. I would like to see approximately the same picture in the "improved" profile by us StringBuilder.append(CharSequence, int, int).


    In fact, we have this:


              ns  percent  samples  top
      ----------  -------  -------  ---
     19071221451   43.78%  1897827  jbyte_disjoint_arraycopy
      6409223440   14.71%   638348  jlong_disjoint_arraycopy
      3933622128    9.03%   387403  java.lang.StringUTF16.newBytesFor
      2067248311    4.75%   204193  java.lang.AbstractStringBuilder.ensureCapacityInternal
      1929218737    4.43%   194751  java.lang.StringUTF16.compress
      1678321343    3.85%   166458  java.util.Arrays.copyOfRange
      1621470408    3.72%   160849  java.lang.String.checkIndex
       969180099    2.22%    96018  java.util.Arrays.copyOf
       581600786    1.34%    57818  java.lang.AbstractStringBuilder.<init>
       417818533    0.96%    41611  appendBounds_jmhTest
       406565329    0.93%    40479  java.lang.String.<init>
       340972882    0.78%    33727  java.lang.AbstractStringBuilder.append
       299895915    0.69%    29982  java.lang.StringBuilder.toString
       183885595    0.42%    18136  SpinPause
       168666033    0.39%    16755  org.openjdk.jmh.infra.Blackhole.consume

    You say: "Here they are, intrinsiki, at the very top!" Indeed, only this is a little wrong intrinsiki (incl. Compare the name of the second from above). Recall:


    publicfinalclassStringBuilder{
      @Override@HotSpotIntrinsicCandidatepublic StringBuilder append(String str){
        super.append(str);
        returnthis;
      }
    }

    Here the intrinsic replaces the challenge StringBuilder.append(String), but in our patch there is no such challenge! Called AbstractStringBuilder.append(String). The challenge we jbyte_disjoint_arraycopyare seeing is an intrinsic for StringLatin1::inflatecalled from AbstractStringBuider::putStringAtthrough String::getBytes. That is, in contrast to StringBuilder::appendworking out not only platform-dependent, but also Java code,


    With the cause of the failure figured out, try to succeed otherwise. It is easy to guess that we need to somehow turn specifically to StringBuilder::append. You can do this by tearing off the previous patch and making changes to yourself StringBuilder:


    publicfinalclassStringBuilder{
      // было@Overridepublic StringBuilder append(CharSequence s, int start, int end){
        super.append(s, start, end);
        returnthis;
      }
      // стало@Overridepublic StringBuilder append(CharSequence s, int start, int end){
        if (s == null) {
          s = "null";
        }
        checkRange(start, end, s.length());
        returnthis.append(s.subSequence(start, end).toString());
      }
    }

    Now everything is done according to the mind: an intrinsic StringBuilder :: append is called.
    Reassemble, run, compare:


    # здесь колонка before содержит данные прогона на исходном JDK,
    # after - на изменённом
    Benchmark             nonLatin   length    before     after   Units
    avgt                      true        5      44,6      60,2   ns/op
    avgt                      true       10      45,7      59,1   ns/op
    avgt                      true       50     129,0     164,6   ns/op
    avgt                      true      100     218,7     276,2   ns/op
    avgt                      true      500     907,1    1088,8   ns/op
    avgt                      true     1000    1626,4    1959,4   ns/op
    gc.alloc.rate.norm        true        5     184,0     264,0    B/op
    gc.alloc.rate.norm        true       10     200,0     296,0    B/op
    gc.alloc.rate.norm        true       50     688,0     904,0    B/op
    gc.alloc.rate.norm        true      100    1192,0    1552,0    B/op
    gc.alloc.rate.norm        true      500    5192,0    6752,0    B/op
    gc.alloc.rate.norm        true     1000   10200,0   13256,0    B/op
    avgt                     false        5      20,8      37,9   ns/op
    avgt                     false       10      24,0      37,9   ns/op
    avgt                     false       50      66,4      80,9   ns/op
    avgt                     false      100     111,0     125,6   ns/op
    avgt                     false      500     419,2     483,6   ns/op
    avgt                     false     1000     840,4     893,8   ns/op
    gc.alloc.rate.norm       false        5      80,0     152,0    B/op
    gc.alloc.rate.norm       false       10      88,0     168,0    B/op
    gc.alloc.rate.norm       false       50     320,0     440,0    B/op
    gc.alloc.rate.norm       false      100     544,0     688,0    B/op
    gc.alloc.rate.norm       false      500    2144,0    2688,0    B/op
    gc.alloc.rate.norm       false     1000    4152,0    5187,2    B/op
    

    I really am very sad, but not better. Now the new profile:


              ns  percent  samples  top
      ----------  -------  -------  ---
     19614374885   44.12%  1953620  jbyte_disjoint_arraycopy
      6645299702   14.95%   662146  jlong_disjoint_arraycopy
      4065789919    9.15%   400167  java.lang.StringUTF16.newBytesFor
      2374627822    5.34%   234746  java.lang.AbstractStringBuilder.ensureCapacityInternal
      1837858014    4.13%   183822  java.lang.StringUTF16.compress
      1472039604    3.31%   145956  java.util.Arrays.copyOfRange
      1316397864    2.96%   130747  appendBounds_jmhTest
       956823151    2.15%    94959  java.util.Arrays.copyOf
       573091712    1.29%    56933  java.lang.AbstractStringBuilder.<init>
       434454076    0.98%    43202  java.lang.String.<init>
       368480388    0.83%    36439  java.lang.AbstractStringBuilder.append
       304409057    0.68%    30442  java.lang.StringBuilder.toString
       272437989    0.61%    26833  SpinPause
       201051696    0.45%    19985  java.lang.StringBuilder.<init>
       198934052    0.45%    19810  appendBounds_avgt_jmhStub

    Little has changed. For me, it remains unclear why the intrinsic did not work when accessing StringBuilder.append(String)from the inside StringBuilder. There is a suspicion that the gluing (inline) of the method StringBuilder.append(String)body into the body StringBuilder.append(CharSequence, int, int)changes something in the processing of VM calls to methods.


    Anyway, it's a fiasco, bro. It was not possible to patch the JDK, but we can still do the manual replacement where it makes sense.


    Literary digression about failures
    Ответная шифровка пришла через два дня. Навигатору совсем не хочется расставаться с «Ото Велара», с фирмой, которая строит удивительно быстрые и мощные военные корабли. Навигатору не хочется читать шифровку мне. Он просто повторяет ответ командного пункта: «Нет». Шифровка не разъясняет, почему «нет». «Нет» в любом случае означает, что он –личность известная большому компьютеру. Если бы о нем ничего не было известно, то ответ был бы положительным: пробуйте. Жаль. Жаль такого, интересного человека терять. А командиру, наверное, жаль меня. Может быть, первый раз жаль. Он видит, что я рвусь в варяги. И ему совсем не хочется вновь толкать меня в борзые.
    Он молчит. Но я-то знаю, что в обеспечении дикая нехватка рабочих рук:
    — Я, товарищ генерал, завтра в обеспечении работаю. Разрешите идти?
    — Иди. — И вдруг улыбается. — Ты знаешь, нет худа без добра.
    — У меня, товарищ генерал, всегда худо без добра.
    — А вот и нет. Тебе запретили его встречать, это плохо. Но к сокровищам нашего опыта мы добавили еще одну крупицу.

    Findings:


    • The code of JDK methods in some cases is not related to the present execution, because instead of the method body, an intrinsic can be executed that is hidden in the bowels of the VM.
    • such methods can be recognized, in particular, they are indicated by a label @HotSpotIntrinsicCandidate, although some methods are intrinsicized without any hints, for example String::equals(and many, many others ).
    • The conclusion from the first two is that our reasoning about how JDK code works may contradict reality. C'est la vie

    PS
    Other possible replacement:


    StringBuilder sb = new StringBuilder();
    sb.append(str, 0, endIndex);
    // -->
    StringBuilder sb = new StringBuilder(str.substring(o, endIndex));

    PPS
    Oracle developers rightly point out that


    It makes it possible for me to make it so
    that it makes it possible for me to
    run it. As you observed, the performance
    tradeoffs are not obvious.

    It is possible to make it possible to optimize sb.append (cs, int, int)
    .

    The proposed solution is intrinsification StringBuilder.append(CharSequence, int, int).


    Task
    Discussion


    PPS
    Interestingly, at the moment when writing something like


    StringBuilder sb = new StringBuilder();
    sb.append(str.substring(0, endIndex));

    "Idea" offers to simplify the code to


    StringBuilder sb = new StringBuilder();
    sb.append(s, 0, endIndex);

    If the performance in this place is not very important to you, then it is probably more correct to use the second, simplified version. Still, most of the code we write for our comrades, not for machines.


    Also popular now: