
Subtleties of the switch statement
- Tutorial
Yes, this is a whole article on the most common switch in JDK 7. It happens that the accumulated material seems interesting and little known, and then it turns out that any grandmother at the entrance has been aware of the switch implementation features for 50 years. But I'll try. For starters, I suggest 3 questions:
Answers:
Let's get it right. Case values, for brevity, I will call keys.
Take an example from the first task. Compile it and disassemble the resulting bytecode.
We are particularly interested in the instruction labeled 1. It means that the compiler created a table (array) containing the transition addresses for all values of the keys from the smallest to the largest, with no exceptions. The switch execution algorithm is something like this (pseudo-code):
In our case: low == 1, high == 4, table == {39, 56, 32, 49}. Since all keys from low to high should be in the table in sequence, the compiler had to add key 3 and set the same behavior for it as for default.
Starting with instruction 32, the code for all case and default is in the order they appear in the source code. Roughly speaking, here the compiler generates continuous code of key handlers. I think it’s now clear why, after the correspondence is found, execution continues until the first break occurs.
A reasonable question arises: what if the values of the keys are very sparse? If we have only two of them: 1 and 1,000,000, then it is extremely unwise to create an array with a million elements. Replace the key 4 in 10 in our example, this will be enough (if suddenly not, increase it). We look at the bytecode (the bytecode of the handlers remained almost the same, therefore not shown):
A table is also set here, but for pairs: the key is the transition label. The JVM specification says that although the search can be linear, the keys must be sorted to enable faster searches, although the search method itself is not specified. Perhaps some implementations use linear search. This is the first case I know (albeit theoretical) with the complexity of switch O (n). Next we will see another, more tangible.
Real boys and girls ask: how does the compiler decide what to choose - tableswitch or lookupswitch? And the most real ones download the OpenJDK sources (note that in other JDK implementations the method of choice may differ) and study the visitSwitch method of the com.sun.tools.javac.jvm.Gen.java class in openjdk / langtools / src / share / classes.
table_space_cost - this size includes the number of all values in the range, plus one value for lo, hi, default_address and the marker of the selected switch method (tableswitch).
table_time_cost - 3 operations: checking the entry into the range (minimum and maximum) and extracting the address label from the table.
lookup_space_cost - 2 values for each key-address pair, plus a value for the size of the table, default_address, and a marker for the selected switch method (lookupswitch).
lookup_time_cost - the worst case is assumed - a linear search.
And the algorithm itself, as you see, is simple. The magic number 3 (“And these people forbid us to pick our nose” (c)) is most likely empirical.
“String.hashCode can easily have collisions, String.equals is too slow, maybe the strings are interned?” - So I thought, until I studied the bytecode.
That is, at the compilation stage, the hash code of all keys is calculated. LookupSwitch is always executed for strings, even if hashes are consistent. At execution, the hash code of the conditional expression is calculated and it is it that is compared with the hash keys. If they match, the strings are also compared (addresses 32–53) using the String.equals () method to prevent possible collisions. And, if successful, the transition to the implementation of the corresponding expression (56–70).
And if we have several keys with the same hashes?
Then these keys are combined under one hash key in lookupswitch, and, if the key matches, iterates over all the lines with this hash and compares them using String.equals (). The example from the 2nd question performs as many as 8 comparisons. Here is the second case with complexity O (n).
If it weren’t for JIT, we could reason about switch optimization. But I was not able to find a good example in which tableswitch would be noticeably faster than lookupswitch (with JIT enabled). Well, perhaps this:
1. Suppose we have N keys with values from 1 to N. In this case, tableswitch will be used.
2. The same keys, but plus one more, with a value much larger than N. In this case, lookupswitch will be used.
(With JIT turned off, everything is clear, the difference is palpable.)
There is no difference with JIT. Perhaps he breaks all the keys into several ranges and treats them like tableswitch. However, starting from N = 721, I experienced a sharp drop in the performance of the second example.
In conclusion, only very wild advice suggests itself, consider them a joke: “Guys, if you have 1000 cases with consecutive keys except a few in a cycle that should be executed a hundred million times per second, then process these few keys outside of the switch. And if in this loop a bunch of string keys with the same hashes, then think about other implementation methods. "
- (Simple) What is the result of this code?
switch(5){ default: System.out.print(0); case 1: System.out.print(1); break; case 4: System.out.print(4); case 2: System.out.print(2); }
- The following 2 options are almost the same. A little different literals.
Why is the first switch running several times slower, at least with JIT disabled//Вариант 1 switch("BBBBBB"){ case "AaAaAa": break; case "AaAaBB": break; case "AaBBAa": break; case "AaBBBB": break; case "BBAaAa": break; case "BBAaBB": break; case "BBBBAa": break; case "BBBBBB": break; }
//Вариант 2 switch("BBBBBB_8"){ case "AaAaAa_1": break; case "AaAaBB_2": break; case "AaBBAa_3": break; case "AaBBBB_4": break; case "BBAaAa_5": break; case "BBAaBB_6": break; case "BBBBAa_7": break; case "BBBBBB_8": break; }
(-Djava.compiler = NONE)? Check yourself in a loop! You won’t be able to JIT with such code, but if you play a little, a small difference will be noticeable. - What is the computational complexity of the algorithm for finding the same value among n cases (at least in JDK 7)?
Answers:
- 01
- The hashCode () method returns the same value for all lines of the first switch. Details below.
- Depending on the case, it can be O (1), O (log n) and even reach O (n).
Let's get it right. Case values, for brevity, I will call keys.
Tableswitch
Take an example from the first task. Compile it and disassemble the resulting bytecode.
javap -c Main.class
0: iconst_5 // Засовываем 5 в стек
1: tableswitch { // 1 to 4 // Забираем 3 из стека и ищем в таблице
1: 39 // 39, 56, 32, 49 – адресные метки переходов
2: 56
3: 32
4: 49
default: 32
}
32: getstatic #27 // Field java/lang/System.out:Ljava/io/PrintStream;
35: iconst_0
36: invokevirtual #33 // Method java/io/PrintStream.print:(I)V
39: getstatic #27 // Field java/lang/System.out:Ljava/io/PrintStream;
42: iconst_1
43: invokevirtual #33 // Method java/io/PrintStream.print:(I)V
46: goto 63 // break
49: getstatic #27 // Field java/lang/System.out:Ljava/io/PrintStream;
52: iconst_4
53: invokevirtual #33 // Method java/io/PrintStream.print:(I)V
56: getstatic #27 // Field java/lang/System.out:Ljava/io/PrintStream;
59: iconst_2
60: invokevirtual #33 // Method java/io/PrintStream.print:(I)V
63: return
We are particularly interested in the instruction labeled 1. It means that the compiler created a table (array) containing the transition addresses for all values of the keys from the smallest to the largest, with no exceptions. The switch execution algorithm is something like this (pseudo-code):
if (valhigh){
jump default;
}else{
jump table[val - low];
}
In our case: low == 1, high == 4, table == {39, 56, 32, 49}. Since all keys from low to high should be in the table in sequence, the compiler had to add key 3 and set the same behavior for it as for default.
Starting with instruction 32, the code for all case and default is in the order they appear in the source code. Roughly speaking, here the compiler generates continuous code of key handlers. I think it’s now clear why, after the correspondence is found, execution continues until the first break occurs.
LookupSwitch
A reasonable question arises: what if the values of the keys are very sparse? If we have only two of them: 1 and 1,000,000, then it is extremely unwise to create an array with a million elements. Replace the key 4 in 10 in our example, this will be enough (if suddenly not, increase it). We look at the bytecode (the bytecode of the handlers remained almost the same, therefore not shown):
1: lookupswitch { // 3
1: 43
2: 60
10: 53
default: 36
}
A table is also set here, but for pairs: the key is the transition label. The JVM specification says that although the search can be linear, the keys must be sorted to enable faster searches, although the search method itself is not specified. Perhaps some implementations use linear search. This is the first case I know (albeit theoretical) with the complexity of switch O (n). Next we will see another, more tangible.
Real boys and girls ask: how does the compiler decide what to choose - tableswitch or lookupswitch? And the most real ones download the OpenJDK sources (note that in other JDK implementations the method of choice may differ) and study the visitSwitch method of the com.sun.tools.javac.jvm.Gen.java class in openjdk / langtools / src / share / classes.
// Determine whether to issue a tableswitch or a lookupswitch
// instruction.
long table_space_cost = 4 + ((long) hi - lo + 1); // words
long table_time_cost = 3; // comparisons
long lookup_space_cost = 3 + 2 * (long) nlabels;
long lookup_time_cost = nlabels;
int opcode =
nlabels > 0 &&
table_space_cost + 3 * table_time_cost <=
lookup_space_cost + 3 * lookup_time_cost
?
tableswitch : lookupswitch;
table_space_cost - this size includes the number of all values in the range, plus one value for lo, hi, default_address and the marker of the selected switch method (tableswitch).
table_time_cost - 3 operations: checking the entry into the range (minimum and maximum) and extracting the address label from the table.
lookup_space_cost - 2 values for each key-address pair, plus a value for the size of the table, default_address, and a marker for the selected switch method (lookupswitch).
lookup_time_cost - the worst case is assumed - a linear search.
And the algorithm itself, as you see, is simple. The magic number 3 (“And these people forbid us to pick our nose” (c)) is most likely empirical.
String comparison
“String.hashCode can easily have collisions, String.equals is too slow, maybe the strings are interned?” - So I thought, until I studied the bytecode.
switch(""){
case "AA": break;
case "BB": break;
}
0: ldc #27 // String
2: dup
3: astore_0
4: invokevirtual #29 // Method java/lang/String.hashCode:()I
7: lookupswitch { // 2
2080: 32
2112: 44
default: 73
}
32: aload_0
33: ldc #35 // String AA
35: invokevirtual #37 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
38: ifne 56
41: goto 73
44: aload_0
45: ldc #41 // String BB
47: invokevirtual #37 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
50: ifne 66
53: goto 73
56: getstatic #43 // Field java/lang/System.out:Ljava/io/PrintStream;
59: iconst_1
60: invokevirtual #49 // Method java/io/PrintStream.println:(I)V
63: goto 73
66: getstatic #43 // Field java/lang/System.out:Ljava/io/PrintStream;
69: iconst_2
70: invokevirtual #49 // Method java/io/PrintStream.println:(I)V
73: return
That is, at the compilation stage, the hash code of all keys is calculated. LookupSwitch is always executed for strings, even if hashes are consistent. At execution, the hash code of the conditional expression is calculated and it is it that is compared with the hash keys. If they match, the strings are also compared (addresses 32–53) using the String.equals () method to prevent possible collisions. And, if successful, the transition to the implementation of the corresponding expression (56–70).
And if we have several keys with the same hashes?
switch(""){
case "Aa": break;
case "BB": break;
}
7: lookupswitch { // 1
2112: 24
default: 62
}
24: aload_0
25: ldc #35 // String Aa
27: invokevirtual #37 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
30: ifne 45
33: aload_0
34: ldc #41 // String BB
36: invokevirtual #37 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
39: ifne 55
42: goto 62
45: getstatic #43 // Field java/lang/System.out:Ljava/io/PrintStream;
48: iconst_1
49: invokevirtual #49 // Method java/io/PrintStream.println:(I)V
52: goto 62
55: getstatic #43 // Field java/lang/System.out:Ljava/io/PrintStream;
58: iconst_2
59: invokevirtual #49 // Method java/io/PrintStream.println:(I)V
62: return
Then these keys are combined under one hash key in lookupswitch, and, if the key matches, iterates over all the lines with this hash and compares them using String.equals (). The example from the 2nd question performs as many as 8 comparisons. Here is the second case with complexity O (n).
conclusions
If it weren’t for JIT, we could reason about switch optimization. But I was not able to find a good example in which tableswitch would be noticeably faster than lookupswitch (with JIT enabled). Well, perhaps this:
1. Suppose we have N keys with values from 1 to N. In this case, tableswitch will be used.
2. The same keys, but plus one more, with a value much larger than N. In this case, lookupswitch will be used.
(With JIT turned off, everything is clear, the difference is palpable.)
There is no difference with JIT. Perhaps he breaks all the keys into several ranges and treats them like tableswitch. However, starting from N = 721, I experienced a sharp drop in the performance of the second example.
In conclusion, only very wild advice suggests itself, consider them a joke: “Guys, if you have 1000 cases with consecutive keys except a few in a cycle that should be executed a hundred million times per second, then process these few keys outside of the switch. And if in this loop a bunch of string keys with the same hashes, then think about other implementation methods. "