
Problem Statement
The task is to design a custom HashSet class named MyHashSet
, simulating the basic functionality of a typical HashSet in programming. The operations that this class needs to support are adding an element, checking the existence of an element, and removing an element. This implementation should avoid using any built-in hash table libraries. The steps to follow for implementing each functionality are:
void add(key)
: This method should insert the integerkey
into the HashSet.bool contains(key)
: This method should return a boolean indicating whether the integerkey
is present in the HashSet.void remove(key)
: This method should remove the integerkey
from the HashSet if it exists; if not, there should be no change and no error.
The HashSet should handle a reasonably high number of operations efficiently, given the constraints.
Examples
Example 1
Input:
["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"] [[], [1], [2], [1], [3], [2], [2], [2], [2]]
Output:
[null, null, null, true, false, null, true, null, false]
Explanation:
MyHashSet myHashSet = new MyHashSet(); myHashSet.add(1); // set = [1] myHashSet.add(2); // set = [1, 2] myHashSet.contains(1); // return True myHashSet.contains(3); // return False, (not found) myHashSet.add(2); // set = [1, 2] myHashSet.contains(2); // return True myHashSet.remove(2); // set = [1] myHashSet.contains(2); // return False, (already removed)
Constraints
0 <= key <= 106
- At most
104
calls will be made toadd
,remove
, andcontains
.
Approach and Intuition
Given the constraints and methods defined in the problem statement, an effective approach to implementing the MyHashSet
class involves the following key points:
Data Structure Choice:
- Considering the absence of built-in hash libraries, a simplistic primary data structure could be an array (list in Python, vector in C++ or ArrayList in Java). However, for direct access considering the range of key values
0 to 10^6
, a boolean array would be appropriate where the index of the array directly represents the key.
- Considering the absence of built-in hash libraries, a simplistic primary data structure could be an array (list in Python, vector in C++ or ArrayList in Java). However, for direct access considering the range of key values
Initialization:
- Initialize a boolean array large enough to accommodate the maximum key value plus one (to handle the zero index). Initially, all values in this array should be set to
False
indicating none of the keys are present in the set initially.
- Initialize a boolean array large enough to accommodate the maximum key value plus one (to handle the zero index). Initially, all values in this array should be set to
Implementing
add(key)
:- For adding a key, simply set the boolean value at the index representing the key to
True
.
- For adding a key, simply set the boolean value at the index representing the key to
Implementing
contains(key)
:- To check if a key exists, return the boolean value at the index representing the key. If it's
True
, that means the key exists.
- To check if a key exists, return the boolean value at the index representing the key. If it's
Implementing
remove(key)
:- To remove a key, set the boolean value at the index representing the key to
False
.
- To remove a key, set the boolean value at the index representing the key to
In this approach, each operation (add, check, or remove) is achieved in constant time O(1)
, which is efficient given the maximum operations cap of 10^4
. This efficiency is crucial as it ensures that the solution can handle the upper limits of operation calls effectively within reasonable time bounds.
Solutions
- Java
class CustomHashSet {
private Bucket[] buckets;
private int range;
public CustomHashSet() {
this.range = 769;
this.buckets = new Bucket[this.range];
for (int i = 0; i < this.range; i++)
this.buckets[i] = new Bucket();
}
protected int hashFunction(int key) {
return (key % this.range);
}
public void add(int key) {
int idx = this.hashFunction(key);
this.buckets[idx].addKey(key);
}
public void remove(int key) {
int idx = this.hashFunction(key);
this.buckets[idx].removeKey(key);
}
public boolean contains(int key) {
int idx = this.hashFunction(key);
return this.buckets[idx].containsKey(key);
}
}
class Bucket {
private BinarySearchTree bst;
public Bucket() {
bst = new BinarySearchTree();
}
public void addKey(int key) {
this.bst.root = this.bst.insert(this.bst.root, key);
}
public void removeKey(int key) {
this.bst.root = this.bst.delete(this.bst.root, key);
}
public boolean containsKey(int key) {
TreeNode node = this.bst.search(this.bst.root, key);
return (node != null);
}
}
class TreeNode {
int value;
TreeNode left;
TreeNode right;
TreeNode(int x) {
value = x;
}
}
class BinarySearchTree {
TreeNode root = null;
public TreeNode search(TreeNode root, int value) {
if (root == null || value == root.value)
return root;
return value < root.value ? search(root.left, value) : search(root.right, value);
}
public TreeNode insert(TreeNode root, int value) {
if (root == null)
return new TreeNode(value);
if (value > root.value)
root.right = insert(root.right, value);
else if (value == root.value)
return root;
else
root.left = insert(root.left, value);
return root;
}
public TreeNode delete(TreeNode root, int key) {
if (root == null)
return null;
if (key > root.value)
root.right = delete(root.right, key);
else if (key < root.value)
root.left = delete(root.left, key);
else {
if (root.left == null && root.right == null)
root = null;
else if (root.right != null) {
root.value = successor(root);
root.right = delete(root.right, root.value);
} else {
root.value = predecessor(root);
root.left = delete(root.left, root.value);
}
}
return root;
}
public int successor(TreeNode root) {
root = root.right;
while (root.left != null)
root = root.left;
return root.value;
}
public int predecessor(TreeNode root) {
root = root.left;
while (root.right != null)
root = root.right;
return root.value;
}
}
This Java solution implements a custom HashSet using a combination of an array and binary search trees. It's designed to efficiently perform core set operations such as adding, removing, and checking the existence of elements. Here’s a breakdown of key components and their functions:
CustomHashSet Class: This class initializes an array of
Bucket
objects. The size is set to a prime number, 769, to reduce the chances of collisions in a hash table. The hash function employed uses modulo operation to determine the index for each key.Bucket Class: Each bucket contains a binary search tree to store the keys that hash to the same index. This helps manage collisions efficiently by using tree operations to store and retrieve keys.
BinarySearchTree and TreeNode Classes: These classes manage the binary search tree operations within each bucket. The
TreeNode
class represents each node of the tree, and theBinarySearchTree
class includes methods to insert, delete, and search for nodes.
The operations defined include:
add(int key)
: Computes the bucket index for the key using the hash function and adds the key to the appropriate bucket.remove(int key)
: Identifies the correct bucket and removes the key from it.contains(int key)
: Checks whether a specific key exists in the set by searching for it in the corresponding bucket.
Efficiency considerations:
- Using a combination of hashing and binary search trees allows the set operations to handle collisions more gracefully and maintain efficiency in terms of time complexity, especially for unbalanced data.
Use cases:
- This implementation can be particularly useful in scenarios where simple array or linked-list based approaches to hash sets may lead to high collision rates, affecting the performance of basic operations like add, remove, and contains.
No comments yet.