Shortest Word Distance II

Updated on 23 June, 2025
Shortest Word Distance II header image

Problem Statement

In this task, we are tasked with designing a data structure, WordDistance, which initializes with an array of strings called wordsDict. The primary function of this data structure is to efficiently compute and return the shortest distance between any two different predetermined strings from the array.

This distance is measured as the number of positions between the two strings in the array. The initialization function, WordDistance(String[] wordsDict), sets up the data structure with wordsDict. Another function, int shortest(String word1, String word2), is designed to return the shortest distance between the occurrences of word1 and word2 in the array.

By implementing this data structure, queries about distances between words can be answered rapidly after an initial setup using the given array of words.

Examples

Example 1

Input:

["WordDistance", "shortest", "shortest"]
[[["practice", "makes", "perfect", "coding", "makes"]], ["coding", "practice"], ["makes", "coding"]]

Output:

[null, 3, 1]

Explanation:

WordDistance wordDistance = new WordDistance(["practice", "makes", "perfect", "coding", "makes"]);
wordDistance.shortest("coding", "practice"); // returns 3
wordDistance.shortest("makes", "coding");    // returns 1

Constraints

  • 1 <= wordsDict.length <= 3 * 10^4
  • 1 <= wordsDict[i].length <= 10
  • wordsDict[i] consists of lowercase English letters.
  • word1 and word2 are in wordsDict.
  • word1 != word2
  • At most 5000 calls will be made to shortest.

Approach and Intuition

The WordDistance class is designed to handle frequent distance queries efficiently by preprocessing the wordsDict array during initialization. Here's a closer look at the approach and the underlying intuition:

  1. Initialization Preprocessing:

    • When an instance of WordDistance is created with wordsDict, the constructor preprocesses wordsDict to create a lookup table.
    • This involves mapping each word to all the indices at which it appears in wordsDict. Storing this information up front allows for quick distance calculations later.
  2. Distance Calculation Strategy:

    • To find the shortest distance between two words, word1 and word2, the stored indices from the initialization step are retrieved.
    • Calculate the minimal distance by iterating through the lists of indices for both words simultaneously, comparing the indices pair-wise to find the smallest absolute difference.
    • This approach eliminates the need to search through wordsDict repeatedly for subsequent queries, thereby making the shortest() method efficient, particularly under the constraint of up to 5000 calls.
  3. Performance Considerations:

    • The preprocess step has a time complexity of O(N) where N is the length of wordsDict.
    • Each call to shortest() should operate in linear time relative to the number of occurrences of the two queried words in wordsDict, making it efficient due to direct access to their indices without needing to traverse the entire array.

In essence, the problem is approached by setting up an efficient data retrieval system in the initialization phase allowing fast responses to queries on word distances. This minimizes the processing time per query, ensuring responsiveness even with repetitive calls.

Solutions

  • Java
  • Python
java
class WordLocator {

    HashMap<String, ArrayList<Integer>> wordIndices;

    public WordLocator(String[] words) {
        this.wordIndices = new HashMap<String, ArrayList<Integer>>();

        for (int idx = 0; idx < words.length; idx++) {
            ArrayList<Integer> indices = this.wordIndices.getOrDefault(words[idx], new ArrayList<Integer>());
            indices.add(idx);
            this.wordIndices.put(words[idx], indices);
        }
    }

    public int findClosest(String word1, String word2) {
        ArrayList<Integer> indices1, indices2;
        indices1 = this.wordIndices.get(word1);
        indices2 = this.wordIndices.get(word2);

        int i1 = 0, i2 = 0, minDistance = Integer.MAX_VALUE;
        while (i1 < indices1.size() && i2 < indices2.size()) {
            minDistance = Math.min(minDistance, Math.abs(indices1.get(i1) - indices2.get(i2)));
            if (indices1.get(i1) < indices2.get(i2)) {
                i1++;
            } else {
                i2++;
            }
        }

        return minDistance;
    }
}

The solution outlined in the Java code solves the problem of finding the shortest distance between the occurrences of two specific words in a list. It encapsulates the functionality within the WordLocator class.

  • The constructor, WordLocator(String[] words), initializes a HashMap where the keys are the words from the input array and the values are lists containing the indices of the words’ occurrences in the array. This setup allows for efficient lookups and is crucial for the performance of the solution.

  • The method findClosest(String word1, String word2) computes the shortest distance between the positions of word1 and word2. It achieves this by:

    • Pulling the pre-stored lists of indices for both words.
    • Using two pointers to iterate through these lists, it calculates the absolute difference between the current indices to find the minimum distance.
    • The pointers are adjusted dynamically—incrementing the pointer which points to the smaller index to possibly find a closer match in subsequent steps.

This approach is efficient as it reduces the problem to a simple list traversal once the hash map is constructed, leveraging the precomputed indices to provide fast access and comparison. The design ensures that the word lookup and distance calculation are optimized, making the solution suitable for scenarios involving frequent queries for different word pairs after the initial setup.

python
from collections import defaultdict

class DistanceFinder:

    def __init__(self, words: List[str]):
        self.word_indices = defaultdict(list)

        # Associate each word with its list of occurrences
        for index, word in enumerate(words):
            self.word_indices[word].append(index)

    def find_minimum_distance(self, word1: str, word2: str) -> int:
        indices1, indices2 = self.word_indices[word1], self.word_indices[word2]
        idx1, idx2 = 0, 0
        minimum_distance = float("inf")

        # Compute the smallest distance between indices
        while idx1 < len(indices1) and idx2 < len(indices2):
            minimum_distance = min(minimum_distance, abs(indices1[idx1] - indices2[idx2]))
            if indices1[idx1] < indices2[idx2]:
                idx1 += 1
            else:
                idx2 += 1
        return minimum_distance

The Shortest Word Distance II problem requires a solution for finding the minimized distance between any two specified words in a list. In the provided Python3 code, the approach involves:

  • Creating a DistanceFinder class that preprocesses the words by mapping each word to its indices of occurrence within the list using a defaultdict from the collections module.
  • Storing all occurrences of each word in a dictionary (named word_indices) during the initialization of the class. This process is achieved through enumeration over the list of words.
  • Defining a method called find_minimum_distance that calculates the smallest distance between the positions of any two given words (word1 and word2). This method retrieves the lists of indices for both words and computes the minimal difference using two-pointer technique:
    • Maintain two pointers, each starting at zero, iterating over the lists of indices for both words.
    • Calculate differences between the indices pointed by both pointers, updating a minimum_distance variable with the smallest value found.
    • Move the pointer pointing to the smaller index to ensure a comparison of all possible index combinations, thereby finding the least distance.

This setup efficiently handles multiple queries for word distances after an initial setup, making it particularly useful when the list is large or when numerous distance queries need to be executed after the list is processed once. The primary advantage of this method lies in its preprocessing step, which optimizes subsequent distance calculations by avoiding repeated searches for each query.

Comments

No comments yet.