Hash Table

What is a Hashtable?

Hashing is a technique or process of mapping keys/values into the hash table by using a hash function. It is done for faster access to elements. The efficiency of mapping depends on the efficiency of the hash function used.
Hash tables are a data structure that utilizes key/value pairs. This means every Node or Bucket has both a key and a value.

What is the idea of a Hash table?

The ability to store key/ value pairs and retrieve the data using the key. This can be done with hash.
- Hash is the ability to encode the key, to reserve a specific location into the data structure, so we can use this hash to retrieve the value directly.
The hash tables make retrieving data very fast because the time complexity will be O(1).

Hash table terminology:

Before we start we should learn hash table terminologies:
1. Hash: A hash is the result of some algorithm taking an incoming string and converting it into a value that could be used for either security or some other purpose. In the case of a hash table, it is used to determine the index of the array.
2. Buckets:A bucket is what is contained in each index of the array of the hash table. Each index is a bucket. An index could potentially contain multiple key/value pairs if a collision occurs.
3. Collisions: A collision is what happens when more than one key gets hashed to the same location of the hash table.

Hashing Structure

Hashing
- As we mentioned, the main job of hashing is to turn the key into an integer number, When we do that we should be aware of a couple of things:
  - It’s very important that hash codes are deterministic: their output is determined only by their input.
  - Hash codes should never have randomness.
  - The same key should always produce the same hash code.
Creating a Hash

A hashtable traditionally is created from an array. To create a hash table do the following:
1. Create an array of size 1024 (the size is important for index placement).
2. Do some sort of logic to turn that “key” into a numeric number value. Here is a possible suggestion:
  1. Add or multiply all the ASCII values together.
  2. Multiply it by a prime number such as 599.
  3. Use modulo to get the remainder of the result, when divided by the total size of the array.
  4. Insert into the array at that index.

For example:

    Key = "Cat"
    Value = "Josie"

    67 + 97 + 116 = 280

    280 * 599 = 69648

    69648 % 1024 = 16

    Key gets placed in an index of 16.

Dive into HOW the key/values are stored in the array.
- As we said above, the bucket refers to the index when we working with hash tables.
- Each bucket holds a key/value pair combination.
- When there is no entry in a specific bucket, the initial value of the bucket will be NULL.
- The hash table starts each bucket empty and overwrites its value once a key generates a hashCode that corresponds with an index.
- Collisions
- What is Collision?
  - Basically, collision refer to the state where we generate two equaled hash.
- What would happen?
  - If the collision didn't handle, the last hash will overwrite the bucket.
- How to handle collision
  - Collision can be handled by changing the initial state of the buckets.
  - So, instead of starting all the buckets as NULL, we can initialize a LinkedList in each one.
  - Now, if we generated two equaled hash keys, the key/value pairs can be stored as a node in a linked list.
  - So now, the term bucket would be much cleaner.
  - Since different keys can lead to the same bucket it’s important to store the entire key/value pair in the bucket, not just the value.
How are hash maps stored and read?
- To store hash maps do this:
  - Accept a key
  - Calculate the hash of the key
  - Use modulus to convert the hash into an array index
  - Store the key with the value by appending both to the end of a linked list
- To read hash map do this:
  - Accept a key
  - Calculate the hash of the key
  - Use modulus to convert the hash into an array index
  - Use the array index to access the short LinkedList representing a bucket
  - Search through the bucket looking for a node with a key/value pair that matches the key you were given.

Hash table methods

Hash table should have main methods:

GetHash method
Add method
Find method
Contains method
Remove method

Code implementation

Step 1: Create the Node class

class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None

Step 2: Create the HashTable class

You can change the capacity to satisfies your requirements.

class HashTable:
  def __init__(self):
      self.capacity = 70
      self.size = 0
      self.buckets = [None] * self.capacity

Step 3: Create the Hash algorithm.

def get_hash(self, key):
    hash_sum = 0
    # For each character in the key
    for idx, char in enumerate(key):
        # Add (index + length of key) ^ (current char code)
        hash_sum += (idx + len(key)) ** ord(char)
        # Perform modulus to keep hash_sum in range [0, self.capacity - 1]
        hash_sum = hash_sum % self.capacity
    return hash_sum

Step 4: Create the insert method.

def insert(self, key, value):
    # 1. Increment size
    self.size += 1
    # 2. Compute index of key
    index = self.get_hash(key)
    # Go to the node corresponding to the hash
    node = self.buckets[index]
    # 3. If bucket is empty:
    if node is None:
        # Create node, add it, return
        self.buckets[index] = Node(key, value)
        return
    # 4. Collision! Iterate to the end of the linked list at provided index
    prev = node
    while node is not None:
        prev = node
        node = node.next
    # Add a new node at the end of the list with provided key/value
    prev.next = Node(key, value)

Step 5: Create the find method

    def find(self, key):
        # 1. Compute hash
        index = self.get_hash(key)
        # 2. Go to first node in list at bucket
        node = self.buckets[index]
        # 3. Traverse the linked list at this node
        while node is not None and node.key != key:
            node = node.next
        # 4. Now, node is the requested key/value pair or None
        if the node is None:
            # Not found
            return None
        else:
            # Found - return the data value
            return node.value

Step 6: Create remove method.

def remove(self, key):
    # 1. Compute hash
    index = self.get_hash(key)
    node = self.buckets[index]
    prev = None
    # 2. Iterate to the requested node
    while node is not None and node.key != key:
        prev = node
        node = node.next
    # Now, node is either the requested node or none
    if node is None:
        # 3. Key not found
        return None
    else:
        # 4. The key was found.
        self.size -= 1
        result = node.value
        # Delete this element in linked list
        if prev is None:
            node = None
        else:
            prev.next = prev.next.next
        # Return the deleted language
        return result

Representation of the hash table

Hash Tables

Table of contents

Hash Table

What is a Hashtable?

What is the idea of a Hash table?

Hash table terminology:

Hashing Structure

Hashing

Creating a Hash

Dive into HOW the key/values are stored in the array.

Collisions

How are hash maps stored and read?

Hash table methods

Code implementation

Representation of the hash table

Resources