Understand the concept of Hash Table

Tram Ho

Perhaps this concept is not too unfamiliar to the Engineer brothers and myself after the first 2 years working and hearing about this concept for the first time is also very vaguely understood. Yeah and of course don’t let the pain get longer (I’m curious, actually), so I also learn about Hash Table ‘s working methods and share with ae to have a closer look at Hash Table to apply to. your work in the most effective way

First, I would like to briefly explain: Hash Table is a set of unordered key-value pairs, and each key is unique. Hash tables are often used to deploy structured collection data such as Map, Set … And can be easily seen in popular programming languages ​​such as Java, C ++, Python, Go. A regular hash table has basic operations including: find, add, and delete. Array or Linked List will not achieve the following:

  • Searching in an unordered array at worst case takes a linear amount of time for the length of the array.
  • In an ordered array, the search will be very fast if we use the Binary Search algorithm but the trade-off and we add an element to the array is ineffective.
  • In the case of a linked list, in contrast to add / remove an element is quite simple, but the search is also time-consuming like the case of an unordered array.

And based on that property, the hash table uses a series of Linked Lists to solve the problem of data collisions. Hash Table guy is drawn from the strengths of Array and Linked List

How Hash Table works will include 2 steps:

  1. The key is converted to a numeric index using the hash function
  2. This index determines which Linked Lists will contain the corresponding key-value pairs

To make it easier to imagine, let’s see a simple example below. We will have a data structure containing up to 1000 records with the key a random integer. And to distribute the data evenly, we’ll use multiple short lists. With all the records with key ending in 000 will belong to 1 list Eg: 1000, 2000, 3000 …. all records with key ending with 001 will belong to another list eg 1001, 2001 … and Continuing like that we have 1000 lists in the same way as above. And can be implemented under the coding as follows:

var table = new LinkedList[1000]

Here LinkedList represents a list of linked key-value pairs. The fact that we add a record key-value includes 2 steps:

  1. We extract the last 3 digits of the key hash = key % 1000
  2. We then add this key-value pair to the table[hash]

The above action always takes a constant amount of time. And what we’re looking for will look like this

Because the keys of a key-value pair are random, we can consider the number of records in each list to be nearly the same. And since we have 1000 lists and at most 1000 records, there will be very few records in a list table[key%1000] and of course the search will be very fast. Ae can see that the Time Complexity of finding and adding new is O (1) . Just like the above method, the deletion also takes a constant time

Through the above example we also see how Hash Table combines both Array and Linked List to store data. In the next article we will go deeper to find more practical examples of Hash Table and Amortized Constant Time Performance concept.

I would like to thank all of you for taking the time to read this article and look forward to receiving feedback as well as sharing your contributions. Wish you a happy Tet with your family and remember to stay healthy while COVID is still complicated.

Happy New Year !!!

Share the news now

Source : Viblo