Subtleties of the switch statement

  • Tutorial
Yes, this is a whole article on the most common switch in JDK 7. It happens that the accumulated material seems interesting and little known, and then it turns out that any grandmother at the entrance has been aware of the switch implementation features for 50 years. But I'll try. For starters, I suggest 3 questions:

  1. (Simple) What is the result of this code?
    default: System.out.print(0);
    case 1: System.out.print(1); break;
    case 4: System.out.print(4);
    case 2: System.out.print(2);

  2. The following 2 options are almost the same. A little different literals.
    //Вариант 1
    case "AaAaAa": break; 
    case "AaAaBB": break;
    case "AaBBAa": break;
    case "AaBBBB": break;
    case "BBAaAa": break;
    case "BBAaBB": break;
    case "BBBBAa": break;
    case "BBBBBB": break;
    //Вариант 2
    case "AaAaAa_1": break;
    case "AaAaBB_2": break;
    case "AaBBAa_3": break;
    case "AaBBBB_4": break;
    case "BBAaAa_5": break;
    case "BBAaBB_6": break;
    case "BBBBAa_7": break;
    case "BBBBBB_8": break;
    Why is the first switch running several times slower, at least with JIT disabled (-Djava.compiler = NONE)? Check yourself in a loop! You won’t be able to JIT with such code, but if you play a little, a small difference will be noticeable.
  3. What is the computational complexity of the algorithm for finding the same value among n cases (at least in JDK 7)?

  1. 01
  2. The hashCode () method returns the same value for all lines of the first switch. Details below.
  3. Depending on the case, it can be O (1), O (log n) and even reach O (n).

Let's get it right. Case values, for brevity, I will call keys.


Take an example from the first task. Compile it and disassemble the resulting bytecode.
javap -c Main.class

         0: iconst_5                          // Засовываем 5 в стек
         1: tableswitch   { // 1 to 4         // Забираем 3 из стека и ищем в таблице
                       1: 39                  // 39, 56, 32, 49 – адресные метки переходов
                       2: 56
                       3: 32
                       4: 49
                 default: 32
        32: getstatic     #27                 // Field java/lang/System.out:Ljava/io/PrintStream;
        35: iconst_0
        36: invokevirtual #33                 // Method java/io/PrintStream.print:(I)V
        39: getstatic     #27                 // Field java/lang/System.out:Ljava/io/PrintStream;
        42: iconst_1
        43: invokevirtual #33                 // Method java/io/PrintStream.print:(I)V
        46: goto          63                  // break
        49: getstatic     #27                 // Field java/lang/System.out:Ljava/io/PrintStream;
        52: iconst_4
        53: invokevirtual #33                 // Method java/io/PrintStream.print:(I)V
        56: getstatic     #27                 // Field java/lang/System.out:Ljava/io/PrintStream;
        59: iconst_2
        60: invokevirtual #33                 // Method java/io/PrintStream.print:(I)V
        63: return

We are particularly interested in the instruction labeled 1. It means that the compiler created a table (array) containing the transition addresses for all values ​​of the keys from the smallest to the largest, with no exceptions. The switch execution algorithm is something like this (pseudo-code):

if (valhigh){
     jump default;
     jump table[val - low];

In our case: low == 1, high == 4, table == {39, 56, 32, 49}. Since all keys from low to high should be in the table in sequence, the compiler had to add key 3 and set the same behavior for it as for default.

Starting with instruction 32, the code for all case and default is in the order they appear in the source code. Roughly speaking, here the compiler generates continuous code of key handlers. I think it’s now clear why, after the correspondence is found, execution continues until the first break occurs.


A reasonable question arises: what if the values ​​of the keys are very sparse? If we have only two of them: 1 and 1,000,000, then it is extremely unwise to create an array with a million elements. Replace the key 4 in 10 in our example, this will be enough (if suddenly not, increase it). We look at the bytecode (the bytecode of the handlers remained almost the same, therefore not shown):

         1: lookupswitch  { // 3
                       1: 43
                       2: 60
                      10: 53
                 default: 36

A table is also set here, but for pairs: the key is the transition label. The JVM specification says that although the search can be linear, the keys must be sorted to enable faster searches, although the search method itself is not specified. Perhaps some implementations use linear search. This is the first case I know (albeit theoretical) with the complexity of switch O (n). Next we will see another, more tangible.

Real boys and girls ask: how does the compiler decide what to choose - tableswitch or lookupswitch? And the most real ones download the OpenJDK sources (note that in other JDK implementations the method of choice may differ) and study the visitSwitch method of the class in openjdk / langtools / src / share / classes.

            // Determine whether to issue a tableswitch or a lookupswitch
            // instruction.
            long table_space_cost = 4 + ((long) hi - lo + 1); // words
            long table_time_cost = 3; // comparisons
            long lookup_space_cost = 3 + 2 * (long) nlabels;
            long lookup_time_cost = nlabels;
            int opcode =
                nlabels > 0 &&
                table_space_cost + 3 * table_time_cost <=
                lookup_space_cost + 3 * lookup_time_cost
                tableswitch : lookupswitch;

table_space_cost - this size includes the number of all values ​​in the range, plus one value for lo, hi, default_address and the marker of the selected switch method (tableswitch).
table_time_cost - 3 operations: checking the entry into the range (minimum and maximum) and extracting the address label from the table.
lookup_space_cost - 2 values ​​for each key-address pair, plus a value for the size of the table, default_address, and a marker for the selected switch method (lookupswitch).
lookup_time_cost - the worst case is assumed - a linear search.

And the algorithm itself, as you see, is simple. The magic number 3 (“And these people forbid us to pick our nose” (c)) is most likely empirical.

String comparison

“String.hashCode can easily have collisions, String.equals is too slow, maybe the strings are interned?” - So I thought, until I studied the bytecode.

case "AA": break;
case "BB": break;

       0: ldc           #27                 // String
       2: dup
       3: astore_0
       4: invokevirtual #29                 // Method java/lang/String.hashCode:()I
       7: lookupswitch  { // 2
                  2080: 32
                  2112: 44
               default: 73
      32: aload_0
      33: ldc           #35                 // String AA
      35: invokevirtual #37                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      38: ifne          56
      41: goto          73
      44: aload_0
      45: ldc           #41                 // String BB
      47: invokevirtual #37                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
      50: ifne          66
      53: goto          73
      56: getstatic     #43                 // Field java/lang/System.out:Ljava/io/PrintStream;
      59: iconst_1
      60: invokevirtual #49                 // Method java/io/PrintStream.println:(I)V
      63: goto          73
      66: getstatic     #43                 // Field java/lang/System.out:Ljava/io/PrintStream;
      69: iconst_2
      70: invokevirtual #49                 // Method java/io/PrintStream.println:(I)V
      73: return

That is, at the compilation stage, the hash code of all keys is calculated. LookupSwitch is always executed for strings, even if hashes are consistent. At execution, the hash code of the conditional expression is calculated and it is it that is compared with the hash keys. If they match, the strings are also compared (addresses 32–53) using the String.equals () method to prevent possible collisions. And, if successful, the transition to the implementation of the corresponding expression (56–70).

And if we have several keys with the same hashes?
case "Aa": break;
case "BB": break;

        7: lookupswitch  { // 1
                   2112: 24
                default: 62
       24: aload_0
       25: ldc           #35                 // String Aa
       27: invokevirtual #37                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
       30: ifne          45
       33: aload_0
       34: ldc           #41                 // String BB
       36: invokevirtual #37                 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
       39: ifne          55
       42: goto          62
       45: getstatic     #43                 // Field java/lang/System.out:Ljava/io/PrintStream;
       48: iconst_1
       49: invokevirtual #49                 // Method java/io/PrintStream.println:(I)V
       52: goto          62
       55: getstatic     #43                 // Field java/lang/System.out:Ljava/io/PrintStream;
       58: iconst_2
       59: invokevirtual #49                 // Method java/io/PrintStream.println:(I)V
       62: return

Then these keys are combined under one hash key in lookupswitch, and, if the key matches, iterates over all the lines with this hash and compares them using String.equals (). The example from the 2nd question performs as many as 8 comparisons. Here is the second case with complexity O (n).


If it weren’t for JIT, we could reason about switch optimization. But I was not able to find a good example in which tableswitch would be noticeably faster than lookupswitch (with JIT enabled). Well, perhaps this:
1. Suppose we have N keys with values ​​from 1 to N. In this case, tableswitch will be used.
2. The same keys, but plus one more, with a value much larger than N. In this case, lookupswitch will be used.
(With JIT turned off, everything is clear, the difference is palpable.)
There is no difference with JIT. Perhaps he breaks all the keys into several ranges and treats them like tableswitch. However, starting from N = 721, I experienced a sharp drop in the performance of the second example.

In conclusion, only very wild advice suggests itself, consider them a joke: “Guys, if you have 1000 cases with consecutive keys except a few in a cycle that should be executed a hundred million times per second, then process these few keys outside of the switch. And if in this loop a bunch of string keys with the same hashes, then think about other implementation methods. "

Also popular now: