Java Object Size

    Do you know how much memory a string takes up? I have heard all sorts of answers to this question, ranging from "I don't know" to "2 bytes times the number of characters in the string." And how much does an empty string take, then? Do you know how much memory an Integer object occupies? And an object of your own class with three Integer fields? Funnily enough, none of my Java programmer friends could answer these questions... Yes, most of us never need this, and hardly anyone on a real Java project will think about it. But it is a bit like not knowing the engine displacement of the car you drive. You can be a great driver and have no idea what the 2.4 or 1.6 on your car means. Yet I am sure that few people are unfamiliar with the meaning of those numbers. So why do Java programmers know so little about this part of their tool?

    Integer vs int

    We all know that in Java everything is an object. Except, perhaps, primitives and the references to objects themselves. Let's look at two typical situations:
    // first case
    int a = 300;
    // second case
    Integer b = 301;
    

    The difference between these simple lines is huge, for both the JVM and OOP. In the first case, all we have is a 4-byte variable on the stack that holds the value. In the second case, we have a reference variable plus the object that this variable refers to. So while in the first case we know for certain that the occupied size is:
    sizeOf(int)
    

    then in the second:
    sizeOf(reference) + sizeOf(Integer)
    

    Looking ahead, I'll say that in the second case the memory consumed is roughly 5 times larger, and the exact figure depends on the JVM. Now let's see why the difference is so large.

    What does an object consist of?

    Before working out how much memory is consumed, you should know what the JVM stores for each object:
    • The object header;
    • Memory for primitive types;
    • Memory for reference types;
    • Padding / alignment - a few unused bytes placed after the object's own data. They keep the object's address a multiple of the machine word, which speeds up memory reads, reduces the number of bits needed for a pointer to an object, and presumably reduces memory fragmentation. It is also worth noting that in Java (on HotSpot, at least) the size of any object is a multiple of 8 bytes!
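
    The padding rule above can be sketched as a small rounding helper. This is only an illustration: the class name AlignmentSketch and its methods are made up for this example, and the 8-byte alignment is the assumption stated in the list above.

```java
// Illustrative sketch: rounding an object size up to the alignment boundary.
final class AlignmentSketch {
    // Assumption: objects are aligned to 8 bytes, as the article states.
    static final int ALIGNMENT = 8;

    /** Rounds size up to the next multiple of ALIGNMENT. */
    static long alignUp(long size) {
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Number of unused padding bytes added after the object's data. */
    static long padding(long size) {
        return alignUp(size) - size;
    }

    public static void main(String[] args) {
        System.out.println(alignUp(12)); // 16
        System.out.println(padding(12)); // 4
        System.out.println(alignUp(24)); // 24 (already aligned, no padding)
    }
}
```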



    Object Header Structure

    Every class instance has a header. On most JVMs (HotSpot, OpenJDK) the header consists of two machine words: 8 bytes on a 32-bit system and 16 bytes on a 64-bit system (a 64-bit HotSpot with compressed class pointers brings this down to 12 bytes). The header may contain the following information:
    • Mark word - in HotSpot this is the first word of the header; it holds several of the items below, such as the identity hash code, GC bits, and the lock state.
    • Hash code - every object has a hash code. By default, Object.hashCode() returns an identity hash that is commonly described as the object's memory address. Some garbage collectors move objects in memory, yet the hash code must stay the same, so the header can be used to store the originally computed value.
    • Garbage collection information - every Java object carries information needed by the memory management system. Often this is one or two flag bits, but it can also be, for example, a combination of bits that stores the number of references to the object.
    • Type information block pointer - points to information about the object's type. This block includes the virtual method table, a pointer to the object that represents the type, and pointers to some additional structures for more efficient interface calls and dynamic type checks.
    • Lock - every object carries information about its lock state. This can be a pointer to a lock object or a direct representation of the lock.
    • Array length - if the object is an array, the header is extended by 4 bytes to store the array's length.
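
    As a rough sketch of the header arithmetic just described (two machine words, plus 4 bytes for arrays), consider the helper below. HeaderSketch and headerSize are hypothetical names, not a JVM API, and the sketch deliberately ignores compressed pointers and any extra padding a real JVM may add.

```java
// Illustrative sketch of the header arithmetic; not a JVM API.
final class HeaderSketch {
    /**
     * Header size in bytes: two machine words, plus 4 bytes for arrays.
     * wordSizeBytes is 4 on a 32-bit JVM and 8 on a 64-bit JVM.
     * Assumption: no compressed pointers and no extra header padding.
     */
    static int headerSize(int wordSizeBytes, boolean isArray) {
        int size = 2 * wordSizeBytes;
        if (isArray) {
            size += 4; // room for the array length
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(headerSize(4, false)); // 8
        System.out.println(headerSize(4, true));  // 12
        System.out.println(headerSize(8, false)); // 16
    }
}
```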


    Java specification

    Primitive types in Java have a predefined size; the specification requires this for code portability. So we will not dwell on primitives, since the specification describes them thoroughly. And what does the specification say about objects? Essentially nothing: it does not mandate any particular internal structure for them. In other words, the size of instances of your classes may differ from one JVM to another. For simplicity, I will give examples for the 32-bit Oracle HotSpot JVM. Now let's look at the most commonly used classes, Integer and String.

    Integer and String

    So, let's try to calculate how much memory an Integer object occupies on our 32-bit HotSpot JVM. To do this, we need to look inside the class itself; we are interested in all fields that are not declared static. There is only one: int value. Based on the information above, we get:
    Header: 8 bytes
    int field: 4 bytes
    Padding to a multiple of 8: 4 bytes
    Total: 16 bytes
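
    The same arithmetic can be written down as a short sketch. It also revisits the "roughly 5 times" comparison with a plain int and the question about a class with three Integer fields. All constants are the article's 32-bit HotSpot assumptions, the class IntegerSizeSketch is hypothetical, and the three-field case assumes each field points to its own Integer object.

```java
// Illustrative sketch of the 32-bit size arithmetic for Integer.
final class IntegerSizeSketch {
    // Assumptions taken from the article's 32-bit HotSpot model.
    static final int HEADER = 8;
    static final int REFERENCE = 4;
    static final int INT = 4;

    /** Rounds size up to a multiple of 8 bytes. */
    static int alignUp(int size) {
        return (size + 7) / 8 * 8;
    }

    /** One Integer object: header plus the int value, padded. */
    static int integerObject() {
        return alignUp(HEADER + INT);
    }

    /** Cost of `Integer b = 301;`: the reference plus the object. */
    static int boxedVariable() {
        return REFERENCE + integerObject();
    }

    /** Hypothetical class with three Integer fields, each with its own Integer. */
    static int threeIntegerFields() {
        int shallow = alignUp(HEADER + 3 * REFERENCE);
        return shallow + 3 * integerObject();
    }

    public static void main(String[] args) {
        System.out.println(integerObject());       // 16
        System.out.println(boxedVariable());       // 20
        System.out.println(boxedVariable() / INT); // 5
        System.out.println(threeIntegerFields());  // 72
    }
}
```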
    

    Now take a look at the fields of the String class (this is the layout from older JDKs; later versions dropped offset and count):
        private final char value[];
        private final int offset;
        private final int count;
        private int hash;
    

    And calculate the size:
    Header: 8 bytes
    int fields: 4 bytes * 3 == 12 bytes
    Reference to the array object: 4 bytes
    Total: 24 bytes
    

    Well, that's not all... Since a String holds a reference to a character array, we are in fact dealing with two different objects: the String object and the array that stores the characters. From an OOP point of view these are separate objects, but from the memory point of view the size of the character array must be added to the result. That is another 12 bytes for the array object itself plus 2 bytes for each character of the string. And, of course, do not forget the padding to a multiple of 8 bytes. In total, the seemingly simple new String("a") comes to:
    new String()
    Header: 8 bytes
    int fields: 4 bytes * 3 == 12 bytes
    Reference to the array object: 4 bytes
    Total: 24 bytes
    new char[1]
    Header: 8 bytes + 4 bytes for the array length == 12 bytes
    char primitives: 2 bytes * 1 == 2 bytes
    Padding to a multiple of 8: 2 bytes
    Total: 16 bytes
    Total for new String("a") == 40 bytes
    

    It is important to note that new String("a") and new String("aa") occupy the same amount of memory, because the padding of the one-character array leaves room for a second character. A typical example of using this effect to your advantage is the hash field in the String class. Without it, a String object would still occupy 24 bytes because of alignment. So those 4 bytes were put to very good use. A brilliant decision, isn't it?
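
    To check these numbers, here is a minimal sketch of the String calculation under the same assumptions (32-bit HotSpot, the four-field String layout shown above). StringSizeSketch and its methods are names invented for this example; the result is an estimate from the model, not a measurement.

```java
// Illustrative sketch of the 32-bit String size model from the article.
final class StringSizeSketch {
    // Assumptions: 32-bit HotSpot, String with value, offset, count, hash.
    static final int HEADER = 8;
    static final int ARRAY_LENGTH = 4;
    static final int REFERENCE = 4;
    static final int INT = 4;
    static final int CHAR = 2;

    /** Rounds size up to a multiple of 8 bytes. */
    static int alignUp(int size) {
        return (size + 7) / 8 * 8;
    }

    /** The String object itself: header, three ints, one array reference. */
    static int stringObject() {
        return alignUp(HEADER + 3 * INT + REFERENCE);
    }

    /** The same object if the hash field did not exist. */
    static int stringObjectWithoutHash() {
        return alignUp(HEADER + 2 * INT + REFERENCE);
    }

    /** The backing char array for a string of the given length. */
    static int charArray(int length) {
        return alignUp(HEADER + ARRAY_LENGTH + CHAR * length);
    }

    /** String object plus its character array. */
    static int stringSize(int length) {
        return stringObject() + charArray(length);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(1));             // 40, i.e. "a"
        System.out.println(stringSize(2));             // 40, i.e. "aa"
        System.out.println(stringSize(3));             // 48
        System.out.println(stringObjectWithoutHash()); // 24, same as with hash
    }
}
```

    Under this model an empty string (length 0) also comes to 40 bytes, which answers the question from the introduction.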

    Reference Size

    A note about reference variables. In principle, the size of a reference depends on the bitness of the JVM, presumably for optimization. So on 32-bit JVMs a reference is usually 4 bytes, and on 64-bit JVMs 8 bytes, although this is not mandatory: HotSpot's compressed references (compressed oops), for example, take 4 bytes on a 64-bit JVM.

    Field grouping

    It should also be noted that the JVM groups an object's fields in advance. This means that the fields of a class are laid out in memory in a particular order rather than in declaration order. The grouping order is:
    1. 8-byte types (double and long)
    2. 4-byte types (int and float)
    3. 2-byte types (short and char)
    4. 1-byte types (boolean and byte)
    5. Reference variables
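
    The grouping order above can be illustrated with a small sorting sketch. It only reorders "type name" strings by the five groups from the list; it does not inspect real classes, and FieldGroupingSketch, groupRank and layoutOrder are hypothetical names. Keeping declaration order inside a group is a property of this sketch (the sort is stable), not a claim about the JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: ordering field declarations by the article's groups.
final class FieldGroupingSketch {
    /** Group number from the list above; lower groups are laid out first. */
    static int groupRank(String type) {
        switch (type) {
            case "double": case "long":  return 1;
            case "int":    case "float": return 2;
            case "short":  case "char":  return 3;
            case "boolean": case "byte": return 4;
            default: return 5; // anything else is treated as a reference
        }
    }

    /** Sorts "type name" declarations into the grouping order. */
    static List<String> layoutOrder(List<String> declarations) {
        List<String> sorted = new ArrayList<>(declarations);
        sorted.sort(Comparator.comparingInt(
                (String d) -> groupRank(d.split(" ")[0])));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> declared = Arrays.asList(
                "byte flag", "String name", "long id", "char grade", "int count");
        System.out.println(layoutOrder(declared));
        // [long id, int count, char grade, byte flag, String name]
    }
}
```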


    Why all this?

    Sometimes you need to estimate roughly how much memory various objects, such as a dictionary, will take; this short reference helps you get your bearings quickly. It is also a potential optimization technique, especially in an environment where you cannot change the JVM settings.
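
    As an example of such a back-of-the-envelope estimate, the sketch below sizes a word list stored as an array of String objects. The storage scheme, the class DictionaryEstimateSketch and the sample figures (100,000 words of 8 characters) are assumptions made for illustration; the sizes follow the 32-bit HotSpot model used throughout the article.

```java
// Illustrative estimate for a word list held in a String[] (32-bit model).
final class DictionaryEstimateSketch {
    /** Rounds size up to a multiple of 8 bytes. */
    static long alignUp(long size) {
        return (size + 7) / 8 * 8;
    }

    /** String object (8 + 3*4 + 4) plus its char array (8 + 4 + 2*length). */
    static long stringSize(int length) {
        long stringObject = alignUp(8 + 3 * 4 + 4);
        long charArray = alignUp(8 + 4 + 2L * length);
        return stringObject + charArray;
    }

    /** Array of references: 12-byte array header plus 4 bytes per element. */
    static long referenceArraySize(int count) {
        return alignUp(8 + 4 + 4L * count);
    }

    /** Reference array plus wordCount strings of the given average length. */
    static long estimateWordList(int wordCount, int averageLength) {
        return referenceArraySize(wordCount)
                + (long) wordCount * stringSize(averageLength);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(8));                // 56
        System.out.println(estimateWordList(100_000, 8)); // 6000016
    }
}
```

    So 100,000 eight-character words, which hold only about 1.6 MB of character data, come to roughly 6 MB under this model.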

    Conclusions

    The topic of memory in Java is interesting and extensive. When I started writing this article, I thought a couple of examples with conclusions would cover it, but the deeper you dig, the more interesting it gets. In general, knowing how memory is allocated for objects is very useful: it helps you save memory, avoid memory problems, and optimize your program in places where it seemed impossible. Of course, places where such optimizations apply are very rare, but still... I hope you found the article interesting.
