How to expand a class file

    Typically, compiling a Java file produces .class files of approximately the same size as the source. I wondered whether a small source file could produce a .class file that is larger, much larger, than the source.

    One could look for short language constructs that compile into long bytecode sequences, but a merely linear increase did not satisfy me. I immediately thought of how finally blocks are compiled, which has already been covered on Habr. In short, for each finally block with a non-empty try block, at least two copies of the finally code are emitted in the bytecode: one for normal completion of the try block and one for completion with an exception. In the latter case, the exception is stored in a new local variable, the finally code is executed, and then the exception is loaded from the local variable and rethrown. But what if we put another try-finally inside the finally, and so on? The result exceeded all expectations.
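
    The duplication described above can be pictured at the source level. This is only a rough sketch of the idea, not the exact code javac emits; the class and method names are hypothetical, and the log list exists only to make the two paths observable.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDesugar {
    // What the programmer writes.
    static void withFinally(Runnable body, List<String> log) {
        try {
            body.run();
        } finally {
            log.add("finally");
        }
    }

    // Roughly what the bytecode does: the finally code appears twice.
    static void desugared(Runnable body, List<String> log) {
        try {
            body.run();
        } catch (Throwable t) {   // exception is stored in a local variable
            log.add("finally");   // copy of the finally code, exceptional path
            throw t;              // exception is loaded again and rethrown
        }
        log.add("finally");       // copy of the finally code, normal path
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        withFinally(() -> {}, a);
        desugared(() -> {}, b);
        System.out.println(a.equals(b)); // both variants log the same events
    }
}
```

    Nesting another try-finally inside the finally code means that the nested construct itself gets copied into both paths, which is where the doubling comes from.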

    I compiled with Oracle javac 1.7.0.60 and 1.8.0.25; the results were almost the same. The exception path is generated even if nothing in the try block can plausibly fail. For example, assigning an integer constant to a local variable compiles to two instructions, iconst and istore, and the specification does not say that either of them can throw an exception. So we write:
    class A {{
      int a;
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      try {a=0;} finally {
      a=0;
      }}}}}}}}}}}}
    }}

    Adding any more non-trivial code to the innermost finally causes a "code too large" compilation error, so we stop here. In case anyone has forgotten, this is an instance initializer block, which is copied into every constructor. There is no need to declare a method for our purposes.
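
    To experiment with other nesting depths, the source can be generated instead of typed by hand. A minimal sketch, assuming a compact one-line layout; the generator class and its method are hypothetical helpers, not part of the original experiment.

```java
class NestedFinallyGenerator {
    // Builds "class A" with an initializer block containing `depth`
    // nested try-finally levels and a final assignment in the innermost finally.
    static String generate(int depth) {
        StringBuilder sb = new StringBuilder("class A{{int a;");
        for (int i = 0; i < depth; i++) {
            sb.append("try{a=0;}finally{");
        }
        sb.append("a=0;");
        for (int i = 0; i < depth; i++) {
            sb.append('}');               // close each finally block
        }
        return sb.append("}}").toString(); // close initializer block and class
    }

    public static void main(String[] args) {
        int depth = args.length > 0 ? Integer.parseInt(args[0]) : 12;
        System.out.println(generate(depth));
    }
}
```

    Redirect the output to A.java and compile it with javac. Depth 12 corresponds to the listing above.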

    This source takes 336 bytes, and the resulting class file has swollen to 6 571 429 bytes, that is, 19 557 times larger (we will call this the growth factor). Even with all debugging information disabled using -g:none, the class file weighs 6 522 221 bytes, which is only slightly less. Let's see what's inside with the javap utility.

    Constant pool

    The constant pool turned out to be small: only 16 entries. It holds just the essentials: attribute names such as Code, the class name, the Java file name, a reference to the constructor of the parent class Object, and so on. With debugging information turned off, three entries disappear: the attribute names LineNumberTable and SourceFile, and the value A.java for the SourceFile attribute.

    The code

    The code of the default constructor is 64507 bytes, just under the maximum allowed size. It starts with normal execution:
             0: aload_0
             1: invokespecial #1                  // Method java/lang/Object."<init>":()V
             4: iconst_0
             5: istore_1
             6: iconst_0
             7: istore_1
             8: iconst_0
             9: istore_1
            10: iconst_0
            11: istore_1
            12: iconst_0
            13: istore_1
            14: iconst_0
            15: istore_1
            16: iconst_0
            17: istore_1
            18: iconst_0
            19: istore_1
            20: iconst_0
            21: istore_1
            22: iconst_0
            23: istore_1
            24: iconst_0
            25: istore_1
            26: iconst_0
            27: istore_1
            28: iconst_0
            29: istore_1
            30: goto          38
    

    That is, the constructor of the parent class is called, and then zero is written to local variable 1 thirteen times. After that, a long goto chain begins, which skips over all the other copies of the finally blocks: 30 -> 38 -> 58 -> 104 -> 198 -> 388 -> 770 -> 1536 -> 3074 -> 7168 -> 15358 -> 31740 -> 64506, and at 64506 we find the long-awaited return instruction.
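
    The roughly doubling distances in the goto chain suggest a simple model: each try-finally level emits its own try body once and its finally code twice. The sketch below is a simplified counting model under that assumption, not a description of javac internals; the class and method names are hypothetical.

```java
class FinallyGrowthModel {
    // Assignments emitted for `depth` nested try-finally levels:
    // one for the try body plus two copies of everything in the finally.
    static long assignments(int depth) {
        return depth == 0 ? 1 : 1 + 2 * assignments(depth - 1);
    }

    // athrow instructions: one per level for the rethrow,
    // plus those inside both copies of the nested finally code.
    static long athrows(int depth) {
        return depth == 0 ? 0 : 1 + 2 * athrows(depth - 1);
    }

    public static void main(String[] args) {
        System.out.println(assignments(12)); // 8191
        System.out.println(athrows(12));     // 4095
    }
}
```

    For 12 levels the model gives 8191 assignments and 4095 athrow instructions, which agrees with the instruction counts javap reports for this class.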

    Between these gotos lie all possible combinations of normal and exceptional completion of each try block. Unexpectedly, every finally copy that handles an exception gets a new local variable to store the exception, even when the blocks are obviously mutually exclusive. Because of this, the code requires 4097 local variables. Some statistics on the instructions:

    • iconst_0 - 8191 times
    • istore_1 - 8191 times
    • goto - 4095 times
    • athrow - 4095 times
    • astore_2 / aload_2 - 1 time
    • astore_3 / aload_3 - 1 time
    • astore / aload - 252 times (local variables with numbers from 4 to 255)
    • astore_w / aload_w - 3841 times (local variables with numbers greater than 255)

    Plus one aload_0, one invokespecial and one return - a total of 32765 instructions. Those who wish can draw a control flow graph and hang it on the wall.
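
    These counts can be cross-checked against the reported code size. The sketch below multiplies each count by the instruction's encoded length (the wide forms of astore and aload take 4 bytes, goto and invokespecial take 3); the class name is hypothetical and the counts are taken from the list above.

```java
class CodeSizeCheck {
    static final int ICONST = 8191, ISTORE = 8191, GOTO = 4095, ATHROW = 4095;
    static final int SHORT_PAIRS = 2;    // astore_2/aload_2 and astore_3/aload_3
    static final int PLAIN_PAIRS = 252;  // astore/aload, locals 4..255
    static final int WIDE_PAIRS = 3841;  // wide astore/aload, locals above 255

    static int instructionCount() {
        int pairs = SHORT_PAIRS + PLAIN_PAIRS + WIDE_PAIRS;
        // + 3 for aload_0, invokespecial and return
        return ICONST + ISTORE + GOTO + ATHROW + 2 * pairs + 3;
    }

    static int codeBytes() {
        int bytes = ICONST * 1 + ISTORE * 1; // one-byte instructions
        bytes += GOTO * 3;                   // opcode + 16-bit offset
        bytes += ATHROW * 1;
        bytes += SHORT_PAIRS * 2 * 1;        // one-byte short forms
        bytes += PLAIN_PAIRS * 2 * 2;        // opcode + 8-bit index
        bytes += WIDE_PAIRS * 2 * 4;         // wide prefix + opcode + 16-bit index
        bytes += 1 + 3 + 1;                  // aload_0, invokespecial, return
        return bytes;
    }

    public static void main(String[] args) {
        System.out.println(instructionCount()); // 32765
        System.out.println(codeBytes());        // 64507
    }
}
```

    Almost half of the 64507 bytes go to the wide loads and stores, a direct consequence of allocating a fresh local variable for every exception.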

    Exception table

    The exception table contains entries of the form (start_pc, end_pc, handler_pc, catch_type) and tells the virtual machine: "if an exception of type catch_type occurs while executing instructions from address start_pc (inclusive) to address end_pc (exclusive), transfer control to address handler_pc". In this case catch_type is any everywhere, that is, exceptions of any type are caught. The table has 8188 entries and takes about as much space as the code: about 64 kilobytes. The beginning looks like this:
             from    to  target type
                26    28    33   any
                24    26    41   any
                42    44    49   any
                49    51    49   any
                22    24    61   any
    


    Line number table

    The line number table is debugging information that maps bytecode instruction addresses to line numbers in the source. It has 12288 entries, most of which refer to the line with the innermost finally. It takes about 48 kilobytes.
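
    Both table sizes follow from the class file layout: an exception table entry consists of four 16-bit fields and a line number entry of two. A small check of the arithmetic, with a hypothetical class name and the entry counts quoted above:

```java
class TableSizes {
    static final int EXCEPTION_ENTRY_BYTES = 8;   // start_pc, end_pc, handler_pc, catch_type
    static final int LINE_NUMBER_ENTRY_BYTES = 4; // start_pc, line_number

    static int exceptionTableBytes(int entries) {
        return entries * EXCEPTION_ENTRY_BYTES;
    }

    static int lineNumberTableBytes(int entries) {
        return entries * LINE_NUMBER_ENTRY_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(exceptionTableBytes(8188));   // 65504, just under 64 KB
        System.out.println(lineNumberTableBytes(12288)); // 49152, exactly 48 KB
    }
}
```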

    StackMapTable

    Where did the rest of the space go? It was taken up by the StackMapTable, which is needed to verify the class file. Roughly speaking, for each branch target in the code this table records the types of the elements on the stack and the types of the local variables at that point. Since we have a lot of local variables and a lot of branch targets, the size of this table grows quadratically with the size of the code. If the local variables for exceptions in disjoint branches were reused, only 13 would be needed and the StackMapTable would be much more modest in size.

    Pushing further

    Can the class file be inflated even more? Of course, you could copy the method containing the nested try-finally, but the compiler can do that for us. Recall that the initializer block is copied into every constructor automatically. It is enough to add many empty constructors with different signatures. Be careful here, or the compiler will run out of memory. Here is a modest version, with the code packed into one line:

    class A{{int a;try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{try{a=0;}finally{a=0;}}}}}}}}}}}}}A(){}A(int a){}A(char a){}A(double a){}A(float a){}A(long a){}A(short a){}A(boolean a){}A(String a){}A(Integer a){}A(Float a){}A(Short a){}A(Long a){}A(Double a){}A(Boolean a){}A(Character a){}}

    Here I have 16 constructors, and the source takes 430 bytes. After compilation we get 104,450,071 bytes, a growth factor of 242,907. And this is not the limit!
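
    To repeat the measurement on your own files, the growth factor can be computed from the two file sizes. A minimal sketch, assuming the factor is the class file size divided by the source size and rounded down; the class name and the command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class GrowthFactor {
    // Class file size divided by source size, rounded down.
    static long growthFactor(long sourceBytes, long classBytes) {
        return classBytes / sourceBytes;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java GrowthFactor A.java A.class");
            return;
        }
        long source = Files.size(Paths.get(args[0]));
        long compiled = Files.size(Paths.get(args[1]));
        System.out.println(growthFactor(source, compiled));
    }
}
```

    With the sizes quoted in this article, growthFactor(336, 6571429) gives 19557 and growthFactor(430, 104450071) gives 242907.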
