Java series of things you probably already know: How does Map / HashMap work?

Wednesday, 27/07/2016

This is a common question in Java candidate interviews, and there are also many people because of this question, which is difficult, today we will discuss it.

What is first, Map and HashMap?

Map is a data set stored as key-value. A Map cannot contain duplicate keys, but each key can be mapped to more than one value.

Map interface is defined to include methods that serve basic operations (such as put, get, remove, containsKey, containsValue, size, empty), serve bulk operations (such as putAll, clear), and serve tasks View data in the set (such as keySet, entrySet, values).

HashMap is a implementation of the Map interface in Java. HashMap does not synchronize and is not thread safe. How to implement it, you must be too clear, I will not need to give another example.

HashMap works based on Hashing. To understand hashing, we must understand the concepts of HashFunction , HashValue and Bucket .

So, what is hashing?

Now, let's look at an array of unorganized data, how to search on that array?

Of course, to search, we have to compare all elements of the array. So the complexity will be O (n).

If the array is sorted, applying binary search, the complexity will decrease to O (log n).

In addition, the search can be faster if the function only returns an index of the element in the array. In this case, the complexity is only O (1). This is Hash Function.

A hash function is a function that, when inserted into any value, returns a corresponding value called Hash Value.

Java provides a hash function that is hashCode () . This function is implemented in Object class and of course all classes in Java are available because they inherit from Object.

As mentioned above, to understand Hashing, we have to understand HashFunction, HashValue and Bucket. Now there are two, so what is Bucket?

Bucket is the concept used to indicate where we store key-value pairs. In HashMap, Bucket uses a LinkedList to store.

How is Java HashMap implemented?

In HashMap, the get function (Object key) calls the hashCode () function of the key to retrieve the hashValue of the key, then brings this hashValue to find the corresponding bucket, where key-value is being stored as an Entry object. Look at it, we see the following.

 public V get (Object key) {
  Node <K, V> e;
  return (e = getNode (hash (key), key)) == null? null: e.value;
}

static final int hash (Object key) {
  int h;
  return (key == null)? 0: (h = key.hashCode ()) ^ (h >>> 16);

public V get (Object key) {

Node <K, V> e;

return (e = getNode (hash (key), key)) == null? null: e.value;

}

static final int hash (Object key) {

int h;

return (key == null)? 0: (h = key.hashCode ()) ^ (h >>> 16);

The get (Object key) function also checks whether the key has null . Null is also considered a key in HashMap, and of course only has a null key. If the key is null then the hash is always 0 and the index is also 0. If the key is not null, call the hash function corresponding to the key.

Next, hashValue is used to find the bucket corresponding to where the Entry is being stored. Entry is stored in bucket with hash, key, value and bucket index. Finally, return the value corresponding to the key.

A small note, the complexity of HashMap's get () and put () functions is O (1) because it uses hashCode to find the value.

And now I'm sure you're wondering, the hash is just a number, the likelihood of duplication is there, so how if multiple keys have the same hashValue?

This is why it's important to implement the generic equals () function with hashCode ().

As mentioned above, a bucket is a LinkedList, it is not java.util.Linkedlist so don't misunderstand. HashMap has its own implementation for this. Therefore, if a hashValue refers to a bucket containing multiple Entries it will approve LinkedList and compare the key of each Entry using key.equals () until equals () returns true. Then the value corresponding to the key will be returned.

All classes can be key if it is override equals () and hashCode () functions, but this also helps to turn classes into immutable classes.

Talk about performance.

To talk about performance, for a HashMap there are two factors that will directly affect performance, namely initial capacity and load factor.

Capacity refers to the number of buckets in HashMap.

Load factor is an index to measure the amount of work to certain thresholds, HashMap's capacity will be automatically increased. When the salary entry in HashMap reaches the threshold of overload factor as well as the capacity of HashMap will be rehash. Then HashMap will increase the number of buckets to double. The default value of load factor is 0.75.

In short.

Hopefully, through this article, you have learned about HashMap's activities in Java, and are not embarrassed during the interview interviewed about this issue anymore.

Say goodbye and see you again in the next article.

ITZone via codealohicguy

Share the news now

Java series of things you probably already know: How does Map / HashMap work?

TikTok becomes the second largest social platform in South Africa

The fastest depreciating after 9 months of launch, iPhone 14 Pro Max continues to break the bottom in Vietnam

Beginner's guide to R: Introduction

10 essential SublimeText plugins for JavaScript developers