HashMap internal work in Java
- Tutorial
[Note from the author of the translation] The translation was made for one’s own needs, but if it turns out to be useful to someone, then the world has become a little bit better! Original article - Internal Working of HashMap in Java
In this article we will see how the get and put methods in the HashMap collection work from the inside. What operations are performed. How is hashing. How value is retrieved by key. How key-value pairs are stored.
As in the previous article , HashMap contains an array of Node and Node can represent a class containing the following objects:
- int - hash
- K - key
- V - value
- Node - the next item
Now we will see how it all works. To begin, we will look at the hashing process.
Hashing
Hashing is the process of converting an object into an integer form, performed using the hashCode () method. It is very important to implement the hashCode () method correctly to ensure the best performance of the HashMap class.
Here I use my own Key class and thus can override the hashCode () method to demonstrate various scenarios. My Key class:
// специальный класс Key для переопределени методов hashCode()// и equals()classKey{
String key;
Key(String key)
{
this.key = key;
}
@OverridepublicinthashCode(){
return (int)key.charAt(0);
}
@Overridepublicbooleanequals(Object obj){
return key.equals((String)obj);
}
}
Here, the overridden hashCode () method returns the ASCII code of the first character of the string. Thus, if the first characters of a string are the same, then the hash codes will be the same. Do not use similar logic in their programs.
This code is created solely for demonstration. Since HashCode allows for a null key, the hash code null will always be 0.
HashCode () method
The hashCode () method is used to get the object hash code. The hashCode () method of the Object class returns a reference to the memory of an object in integer form (identity hash code ). Signature method public native hashCode()
. This suggests that the method is implemented as native, since there is no method in java that allows you to get a reference to an object. You can define your own implementation of the hashCode () method. In the HashMap class, the hashCode () method is used to calculate the bucket and therefore calculate the index.
Equals () method
The equals method is used to test two objects for equality. The method is implemented in the Object class. You can redefine it in your own class. In the HashMap class, the equals () method is used to check the equality of keys. If the keys are equal, the equals () method returns true, otherwise false.
Baskets (Buckets)
Bucket is the only element of the HashMap array. It is used to store nodes (Nodes). Two or more nodes can have the same bucket. In this case, the linked list is used to link the nodes . Bucket-y differ in capacity (capacity property). The relationship between bucket and capacity is as follows:
capacity = number of buckets * load factor
One bucket can have more than one node, it depends on the implementation of the hashCode () method. The better your hashCode () method is implemented, the better your bucket s will be used.
HashMap Index Calculation
The hash key code can be large enough to create an array. The generated hash code may be in the range of an integer type, and if we create an array of this size, we will easily get an outOfMemoryException exception. Therefore, we generate an index to minimize the size of the array. In essence, the following operation is performed to calculate the index:
index = hashCode(key) & (n-1).
where n is equal to the number of the bucket or the value of the length of the array. In our example, I consider n as the default value of 16.
- initially empty hashMap: here the size of the hashmap is 16:
HashMap map = new HashMap();
HashMap:
- Key pair insertion - Value: add one key pair - value to the end of a HashMap
map.put(new Key("vishal"), 20);
Steps:
Calculate the value of the key {"vishal"}. It will be generated as 118.
Calculate the index using the method
index
that will be equal to 6.Create a node object.
{ int hash = 118// {"vishal"} не строка, а// объект класса Key Key key = {"vishal"} Integer value = 20 Node next = null }
Place the object in position with index 6, if the place is free.
Now HashMap looks like this:
- add another pair of key - value: now add another pair
map.put(new Key("sachin"), 30);
Steps:
Calculate the value of the key {"sachin"}. It will be generated as 115.
Calculate the index using the method
index
that will be equal to 3.Create a node object.
{ int hash = 115 Key key = {"sachin"} Integer value = 30 Node next = null }
Place the object in position with index 3, if the place is free.
Now HashMap looks like this:
- in case of collisions: now add another pair
map.put(new Key("vaibhav"), 40);
Steps:
Calculate the value of the key {"vaibhav"}. It will be generated as 118.
Calculate the index using the method
index
that will be equal to 6.Create a node object.
{ int hash = 118 Key key = {"vaibhav"} Integer value = 20 Node next = null }
Place the object in position with index 6, if the place is free.
In this case, in the position with index 6, another object already exists, this case is called a collision.
In this case, check using the hashCode () and equals () methods that both keys are the same.
If the keys are the same, replace the current value with a new one.
Otherwise, bind the new and old objects using the “linked list” data structure, indicating the link to the next object in the current one and keep both under index 6.
Now HashMap looks like this:
[Note from the author of the translation] The image is taken from the original article and initially contains an error. The link to the next object in the vishal object with the index 6 is not equal to null, it contains a pointer to the vaibhav object.
- we get the value by the sachin key:
map.get(new Key("sachin"));
Steps:
Calculate the hash code for the {“sachin”} object. It has been generated as 115.
Calculate the index using the method
index
that will be equal to 3.Go to index 3 and compare the key of the first element with the existing value. If they are equal, turn the value; otherwise, check for the next element, if it exists.
In our case, the element is found and the return value is 30.
- we get the value by key vaibahv:
map.get(new Key("vaibhav"));
Steps:
Calculate the hash code of the object {"vaibhav"}. It was generated as 118.
Calculate the index using the method
index
that will be equal to 6.Go to index 6 and compare the key of the first element with the existing value. If they are equal, turn the value; otherwise, check for the next element, if it exists.
In this case, it is not found and the next node object is not null.
If the next node object is null, return null.
If the next node object is not null, go to it and repeat the first three steps until the element is found or the next node object is null.
// Java программа для иллюстрации// внутренней работы HashMapimport java.util.HashMap;
classKey{
String key;
Key(String key)
{
this.key = key;
}
@OverridepublicinthashCode(){
int hash = (int)key.charAt(0);
System.out.println("hashCode for key: "
+ key + " = " + hash);
return hash;
}
@Overridepublicbooleanequals(Object obj){
return key.equals(((Key)obj).key);
}
}
// Driver classpublicclassGFG{
publicstaticvoidmain(String[] args){
HashMap map = new HashMap();
map.put(new Key("vishal"), 20);
map.put(new Key("sachin"), 30);
map.put(new Key("vaibhav"), 40);
System.out.println();
System.out.println("Value for key sachin: " + map.get(new Key("sachin")));
System.out.println("Value for key vaibhav: " + map.get(new Key("vaibhav")));
}
}
Conclusion:
hashCode for key: vishal = 118
hashCode for key: sachin = 115
hashCode for key: vaibhav = 118
hashCode for key: sachin = 115
Value for key sachin: 30
hashCode for key: vaibhav = 118
Value for key vaibhav: 40
Java 8 changes
As we already know, in the event of a collision, the node object is stored in the "linked list" data structure and the equals () method is used to compare keys. These are comparisons for finding the right key in a linked list — a linear operation and, in the worst case, the complexity is O (n) .
To fix this problem in Java 8 after reaching a certain threshold, instead of linked lists, balanced trees are used. This means that the HashMap at the beginning keeps the objects in a linked list, but after the number of elements in the hash reaches a certain threshold, the transition to balanced trees occurs. That improves performance at worst from O (n) to O (log n).
Important point
- The complexity of get () and put () operations is almost constant until re-hashing is performed.
- In the case of collisions, if the indices of two or more node objects are the same, the node objects are connected using a linked list, i.e. the reference to the second node object is stored in the first, the third in the second, and so on.
- If the key already exists in the HashMap, the value is overwritten.
- The hash code is null 0.
- When an object is retrieved by a key, a transition is made to the linked list until an object is found or the reference to the next object is null.