Hashing Search Algorithm

Hi everyone, inside this article we will see the concept about Hashing Search Algorithm.

Hashing is a search algorithm used to quickly find the location of a data record in a table or database. It works by taking a value (known as the key) and using a hash function to compute an index into an array of buckets or slots, where the corresponding record can be found.

The hash function takes the key and performs some mathematical operations to convert it into a numerical value that can be used as an index. The resulting index is used to access the appropriate bucket or slot, where the desired record can be found.

How Algorithm of Hashing Search works?

The algorithm for hashing search typically involves the following steps:

  1. First, a hash function is chosen. The hash function takes the key as input and returns an index into the array of buckets.
  2. The key-value pair is inserted into the appropriate bucket, based on the hash value.
  3. When searching for a key, the hash function is used again to compute the index of the bucket where the key might be located.
  4. If the key is found in the bucket, the corresponding value is returned.
  5. If the key is not found in the bucket, the search algorithm may need to search other buckets or perform additional steps to resolve collisions.
  6. When deleting a key-value pair, the hash function is used to locate the bucket, and the pair is removed from the bucket.

The efficiency of the algorithm depends on the quality of the hash function and the number of collisions that occur. If the hash function produces a good distribution of hash values and collisions are rare, then the search algorithm can be very efficient. However, if the hash function is poorly designed or collisions are frequent, then the search algorithm can become slow and inefficient. To mitigate the impact of collisions, various techniques can be used, such as open addressing or chaining.

Here’s a pseudocode implementation of hashing search:

function hashingSearch(key, hashTable):
    // Compute hash value
    hashValue = hashFunction(key)
    
    // Search for key in bucket
    if hashTable[hashValue] is not null:
        for pair in hashTable[hashValue]:
            if pair.key == key:
                return pair.value
    
    // Key not found
    return null

In this implementation, key is the key that we are searching for, and hashTable is the table or array of buckets where the key-value pairs are stored. The hashFunction is a user-defined function that takes the key as input and returns a hash value, which is used to index into the hash table.

Example Code

Here’s an example of a C++ program that performs hashing search:

#include <iostream>
#include <vector>

using namespace std;

// Define a key-value pair struct
struct Pair
{
  int key;
  string value;
};

// Define a hash table class
class HashTable
{
public:
  // Constructor with table size as input
  HashTable(int tableSize)
  {
    this->tableSize = tableSize;
    table.resize(tableSize);
  }

  // Hash function to compute index from key
  int hashFunction(int key)
  {
    return key % tableSize;
  }

  // Insert key-value pair into hash table
  void insert(int key, string value)
  {
    int index = hashFunction(key);
    table[index].push_back({key, value});
  }

  // Search for key in hash table
  string search(int key)
  {
    int index = hashFunction(key);
    for (auto pair : table[index])
    {
      if (pair.key == key)
      {
        return pair.value;
      }
    }
    return "Key not found";
  }

private:
  int tableSize;
  vector<vector<Pair>> table;
};

int main()
{
  HashTable table(5);

  table.insert(1, "apple");
  table.insert(2, "banana");
  table.insert(3, "orange");

  cout << table.search(2) << endl; // Output: "banana"
  cout << table.search(4) << endl; // Output: "Key not found"

  return 0;
}

The above program will output:

banana
Key not found

In this program, the Pair struct represents a key-value pair. The HashTable class contains a vector of vectors to represent the hash table, where each inner vector contains a list of Pair objects that have the same hash value.

The HashTable class contains a constructor that takes the size of the hash table as input, a hashFunction method to compute the index from the key, an insert method to insert key-value pairs into the hash table, and a search method to search for a key in the hash table.

In the main function, we create a HashTable object with a size of 5, insert some key-value pairs, and then search for a couple of keys using the search method. The program outputs the corresponding values for the keys, or “Key not found” if the key is not present in the hash table.

Complexity Analysis of Hashing Search

The time complexity of hashing search depends on the quality of the hash function and the number of collisions that occur.

In the best-case scenario, where there are no collisions and the hash function produces a unique index for each key, the time complexity of hashing search is O(1). This is because the search algorithm can compute the index of the bucket that contains the key in constant time, and then retrieve the value from the bucket in constant time as well.

In the worst-case scenario, where all keys have the same hash value and are stored in the same bucket, the time complexity of hashing search becomes O(n), where n is the number of keys in the bucket. In this case, the search algorithm must examine each key-value pair in the bucket to find the key that we are searching for.

In practice, the quality of the hash function and the number of collisions depend on the size of the hash table, the number of keys, and the distribution of the keys. By choosing a good hash function and an appropriate hash table size, we can minimize the number of collisions and achieve an average-case time complexity of O(1) or close to it. Additionally, techniques like chaining or open addressing can be used to handle collisions and further improve the efficiency of the algorithm.

When is the Hashing Search algorithm used?

Hashing search is used when we need to search for a key in a large collection of data that can be represented as key-value pairs. This algorithm is particularly useful when we need to perform frequent search operations and when the size of the data set is large.

Hashing search is often used in databases and information retrieval systems, where we need to quickly search for data based on a unique key. For example, a database management system may use hashing search to quickly retrieve records based on a primary key. Information retrieval systems may use hashing search to retrieve documents based on a unique identifier or keyword.

Hashing search algorithm has several advantages and disadvantages:

Advantages:

  1. Fast search time: Hashing search provides constant time complexity for searching in the best case scenario. This makes it ideal for large data sets where search time is a critical factor.
  2. Efficient storage: Hashing search requires less memory space compared to other search algorithms like linear search or binary search. This is because it stores only the key-value pairs in the hash table, rather than the entire data set.
  3. Good performance on average: Hashing search provides good performance on average for a wide range of data sets when the hash function is chosen carefully.
  4. Easy implementation: Hashing search is easy to implement and understand, and many programming languages provide built-in hash table data structures.

Disadvantages:

  1. Limited search flexibility: Hashing search is primarily designed for exact match search based on unique keys. It is not suitable for range search or partial match search.
  2. Collision handling: Hashing search may suffer from collisions, which occur when two keys are mapped to the same hash value. Collisions can be resolved through chaining or open addressing, but this can add complexity to the algorithm.
  3. Hash function selection: Hashing search requires a good hash function to ensure even distribution of keys across the hash table. Selecting a poor hash function can result in poor performance and higher collision rates.
  4. Hash table resizing: If the data set changes frequently, the hash table may need to be resized to ensure optimal performance.

We hope this article helped you to understand Hashing Search Analysis in a very detailed way.

Online Web Tutor invites you to try Skillshike! Learn CakePHP, Laravel, CodeIgniter, Node Js, MySQL, Authentication, RESTful Web Services, etc into a depth level. Master the Coding Skills to Become an Expert in PHP Web Development. So, Search your favourite course and enroll now.

If you liked this article, then please subscribe to our YouTube Channel for PHP & it’s framework, WordPress, Node Js video tutorials. You can also find us on Twitter and Facebook.