Minimum Window Substring

Updated on 17 June, 2025
Minimum Window Substring header image

Problem Statement

This problem requires finding the smallest possible substring within a given string s, such that this substring contains all the characters of another string t, including their duplicates. The lengths of the strings s and t are denoted as m and n respectively. The function should return the shortest substring which includes all characters of t. If no such substring exists that satisfies these conditions, the function should return an empty string "". The provided solution needs to be unique for the given inputs, ensuring that t's characters are accounted for in the final substring result.

Examples

Example 1

Input:

s = "ADOBECODEBANC", t = "ABC"

Output:

"BANC"

Explanation:

The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.

Example 2

Input:

s = "a", t = "a"

Output:

"a"

Explanation:

The entire string s is the minimum window.

Example 3

Input:

s = "a", t = "aa"

Output:

""

Explanation:

Both 'a's from t must be included in the window.
Since the largest window of s only has one 'a', return empty string.

Constraints

  • m == s.length
  • n == t.length
  • 1 <= m, n <= 105
  • s and t consist of uppercase and lowercase English letters.

Approach and Intuition

To address this problem, the approach revolves around the concept of a sliding window and hashing:

  1. Initialize Hash: Start by creating a hash map or count of all characters in t. This will help keep track of which characters (including duplicates) still need to be included in the current substring window of s.

  2. Sliding Window Technique: Utilize a sliding window represented by two pointers initially pointing to the start of s. Expand and contract this window by moving the pointers based on the presence and sufficiency of the required characters:

    • Expand the right pointer to enlarge the window until it contains all the necessary characters from t.
    • Once all characters are included, attempt to contract the window from the left to minimize its size without losing all required characters.
  3. Update Minimum Window: Each time a valid window containing all of t's characters is found, compare it with the previously stored minimum window size and update if the current one is smaller.

  4. Boundary Conditions: Throughout the process, ensure conditions like non-existent characters from t in s, cases where s is smaller than t, and other edge scenarios are handled gracefully.

  5. Final Result: After evaluating all potential windows, the smallest window that contains every character of t is returned. If no such window exists, return "".

Case Studies from Examples:

  • For s = "ADOBECODEBANC", t = "ABC", the smallest window satisfying all contains of t is "BANC".
  • For the single character strings with s = "a", t = "a", the entire string s itself is the minimum window.
  • In cases where t has more occurrences of a particular character than s, such as s = "a", t = "aa", it's impossible to get a valid window, and an empty string is returned.

Solutions

  • C++
  • Java
  • C
  • JavaScript
  • Python
cpp
class Solution {
public:
    string minimumWindowSubstring(string s, string t) {
        struct CharIndex {
            int pos;
            char value;
        };
        if (s.empty() || t.empty()) {
            return "";
        }
        unordered_map<char, int> targetFreq;
        for (char c : t) {
            targetFreq[c]++;
        }
        int needed = targetFreq.size();
        vector<CharIndex> relevantChars;
        for (int i = 0; i < s.length(); i++) {
            if (targetFreq.count(s[i])) relevantChars.push_back({i, s[i]});
        }
        int left = 0, right = 0, formed = 0;
        unordered_map<char, int> countWindow;
        vector<int> result = {-1, 0, 0};
        while (right < relevantChars.size()) {
            char charRight = relevantChars[right].value;
            countWindow[charRight]++;
            if (targetFreq.count(charRight) && countWindow[charRight] == targetFreq[charRight]) {
                formed++;
            }
            while (left <= right && formed == needed) {
                char charLeft = relevantChars[left].value;
                int endPos = relevantChars[right].pos;
                int startPos = relevantChars[left].pos;
                if (result[0] == -1 || endPos - startPos + 1 < result[0]) {
                    result[0] = endPos - startPos + 1;
                    result[1] = startPos;
                    result[2] = endPos;
                }
                countWindow[charLeft]--;
                if (targetFreq.count(charLeft) && countWindow[charLeft] < targetFreq[charLeft]) {
                    formed--;
                }
                left++;
            }
            right++;
        }
        return result[0] == -1 ? "" : s.substr(result[1], result[0]);
    }
};

The provided C++ program aims to solve the problem of finding the minimum window substring. This solution entails finding the smallest substring within a given string s that contains all the characters of another string t. Below is a detailed breakdown of the implemented algorithm:

  • Initialize frequency counts for characters of string t using an unordered map, targetFreq.
  • Track the number of unique characters in t that need to be present in the substring.
  • Filter out positions in s that contain characters relevant to t and store their indices and characters in relevantChars.
  • Use two pointers, left and right, initialized at the start of relevantChars to explore potential windows that can contain all the characters from t.
  • Utilize another unordered map, countWindow, to maintain the count of characters in the current window.
  • Use a vector, result, to store the size of the current smallest window and its start and end positions in s.
  • Expand the window by moving right pointer and include characters in countWindow.
  • Once a valid window containing all required characters is found (checked by formed variable), attempt to reduce its size by sliding the left pointer to the right.
  • For every valid window, compare its size with the smallest found so far and update result accordingly.
  • After exploring all potential windows, extract and return the smallest window from s using the indices recorded in result.

This approach ensures that character positions relevant to t are the main focus, facilitating efficient window size adjustments and checks, thereby enhancing the overall performance of the substring search.

java
class Solution {
    public String minimumWindowSubstring(String str, String target) {
        if (str.isEmpty() || target.isEmpty()) {
            return "";
        }
    
        Map<Character, Integer> targetFrequency = new HashMap<>();
    
        for (int i = 0; i < target.length(); i++) {
            int freq = targetFrequency.getOrDefault(target.charAt(i), 0);
            targetFrequency.put(target.charAt(i), freq + 1);
        }
    
        int requiredChars = targetFrequency.size();
    
        List<Pair<Integer, Character>> filteredStr = new ArrayList<>();
        for (int i = 0; i < str.length(); i++) {
            char currentChar = str.charAt(i);
            if (targetFrequency.containsKey(currentChar)) {
                filteredStr.add(new Pair<Integer, Character>(i, currentChar));
            }
        }
    
        int left = 0, right = 0, formed = 0;
        Map<Character, Integer> windowCharacters = new HashMap<>();
        int[] result = {-1, 0, 0};
    
        while (right < filteredStr.size()) {
            char character = filteredStr.get(right).getValue();
            int currentCount = windowCharacters.getOrDefault(character, 0);
            windowCharacters.put(character, currentCount + 1);
    
            if (targetFrequency.containsKey(character) &&
                windowCharacters.get(character).intValue() == targetFrequency.get(character).intValue()) {
                formed++;
            }
    
            while (left <= right && formed == requiredChars) {
                character = filteredStr.get(left).getValue();
    
                int endIndex = filteredStr.get(right).getKey();
                int startIndex = filteredStr.get(left).getKey();
                if (result[0] == -1 || endIndex - startIndex + 1 < result[0]) {
                    result[0] = endIndex - startIndex + 1;
                    result[1] = startIndex;
                    result[2] = endIndex;
                }
    
                windowCharacters.put(character, windowCharacters.get(character) - 1);
                if (targetFrequency.containsKey(character) &&
                    windowCharacters.get(character).intValue() < targetFrequency.get(character).intValue()) {
                    formed--;
                }
                left++;
            }
            right++;
        }
        return result[0] == -1 ? "" : str.substring(result[1], result[2] + 1);
    }
}

The Java solution implements a method to find the smallest substring in a given string str that contains all the characters of another string target. Here's a breakdown of how the solution works:

  • Initialize frequency maps to store the count of each character required from target and the characters within the current window in str.
  • Filter out characters in str that are not part of target to optimize the search.
  • Use a sliding window approach with two pointers, left and right, to explore possible substrings that cover all required characters.
  • Expand the window by moving right pointer and adjust the window size by moving left pointer when the window meets the requirement (contains all target characters) to potentially find a smaller valid window.
  • Maintain the character count of the current window and check against target's required count. Adjust counters and a formation tracker as characters enter or leave the window.
  • Record the smallest window found during this operation using a result array maintaining the length and indices of this window.
  • Return the smallest valid substring, or an empty string if no valid substring exists.

This approach ensures optimal exploration of str, using character mapping and filtering to optimize performance by focusing only on relevant characters and adjusting window size dynamically. The technique effectively handles cases with large input strings or when target characters are scattered widely across str.

c
struct Position {
    int idx;
    char value;
};
    
char* smallestWindowSubstring(char* S, char* T) {
    if (strlen(S) == 0 || strlen(T) == 0) {
        return "";
    }
    struct FrequencyMap {
        char character;
        int amount;
        UT_hash_handle hh;
    };
        
    struct FrequencyMap *targetFreq = NULL, *element, *temporary, *currentWindow = NULL;
        
    for (int i = 0; i < strlen(T); i++) {
        HASH_FIND(hh, targetFreq, &T[i], sizeof(char), temporary);
        if (temporary) {
            temporary->amount++;
        } else {
            temporary = (struct FrequencyMap*) malloc(sizeof *temporary);
            temporary->character = T[i];
            temporary->amount = 1;
            HASH_ADD(hh, targetFreq, character, sizeof(char), temporary);
        }
    }
        
    int requiredMatches = HASH_COUNT(targetFreq);
    struct Position* indices =
        (struct Position*)malloc(strlen(S) * sizeof(struct Position));
        
    int count = 0;
    for (int i = 0; i < strlen(S); i++) {
        HASH_FIND(hh, targetFreq, &(S[i]), sizeof(char), temporary);
        if (temporary) {
            indices[count].idx = i;
            indices[count].value = S[i];
            count++;
        }
    }
    
    int left = 0, right = 0, formed = 0, result[3] = {-1, 0, 0};
        
    while (right < count) {
        char c = indices[right].value;
        HASH_FIND(hh, currentWindow, &c, sizeof(char), temporary);
        if (temporary) {
            temporary->amount++;
        } else {
            temporary = (struct FrequencyMap*) malloc(sizeof *temporary);
            temporary->character = c;
            temporary->amount = 1;
            HASH_ADD(hh, currentWindow, character, sizeof(char), temporary);
        }
        HASH_FIND(hh, targetFreq, &c, sizeof(char), element);
        if (element && temporary->amount == element->amount) {
            formed++;
        }
            
        while (left <= right && formed == requiredMatches) {
            c = indices[left].value;
            int end = indices[right].idx;
            int start = indices[left].idx;
            if (result[0] == -1 || end - start + 1 < result[0]) {
                result[0] = end - start + 1;
                result[1] = start;
                result[2] = end;
            }
            HASH_FIND(hh, currentWindow, &c, sizeof(char), temporary);
            if (temporary) {
                temporary->amount--;
            }
            HASH_FIND(hh, targetFreq, &c, sizeof(char), element);
            if (element && temporary->amount < element->amount) {
                formed--;
            }
            left++;
        }
        right++;
    }
    return result[0] == -1 ? "" : strndup(S + result[1], result[0]);
}

The given C program is designed to solve the problem of finding the smallest substring in string S that includes all the characters of string T. Here’s how the solution is structured and implemented:

  • Define a Position structure to store indices and values of characters from S that are found in T.
  • Utilize a FrequencyMap structure with UTHash to efficiently count and access character frequencies for both strings S and T.
  • First, generate a hashmap (targetFreq) for the frequency of each character in T.
  • Create an array (indices) of Position for storing relevant character positions from S that are in T.
  • Two pointers technique (left and right) is used alongside a sliding window approach to explore potential minimal substrings in S that cover all characters in T with necessary counts.
  • Maintain a dynamic hashmap (currentWindow) while adjusting the window with the left and right pointers, tracking the number of valid characters (formed) that match the required frequency in targetFreq.
  • Upon finding a valid window where the number of formed characters equals required unique characters in T, attempt to update the smallest window size.
  • Once the smallest valid window is determined (if any), extract and return the substring from S.

This code effectively uses advanced data structures and algorithms like hashing and sliding window, which are well-suited for optimization problems involving substring searches and frequency management, leading to a solution with potentially lower time complexity compared to naive approaches.

js
function shortestSubsequence(source, target) {
    if (source.length == 0 || target.length == 0) {
        return "";
    }
    let targetFreq = {};
    for (let i = 0; i < target.length; i++) {
        let charCount = targetFreq.hasOwnProperty(target[i]) ? targetFreq[target[i]] : 0;
        targetFreq[target[i]] = charCount + 1;
    }
    let needed = Object.keys(targetFreq).length;
    let validS = [];
    for (let i = 0; i < source.length; i++) {
        let char = source[i];
        if (targetFreq.hasOwnProperty(char)) {
            validS.push([i, char]);
        }
    }
    let left = 0,
        right = 0,
        valid = 0;
    let freqS = {};
    let result = [-1, 0, 0];
    while (right < validS.length) {
        let char = validS[right][1];
        let count = freqS.hasOwnProperty(char) ? freqS[char] : 0;
        freqS[char] = count + 1;
        if (targetFreq.hasOwnProperty(char) && freqS[char] == targetFreq[char]) {
            valid++;
        }
        while (left <= right && valid == needed) {
            char = validS[left][1];
            let end = validS[right][0];
            let start = validS[left][0];
            if (result[0] == -1 || end - start + 1 < result[0]) {
                result[0] = end - start + 1;
                result[1] = start;
                result[2] = end;
            }
            freqS[char] = freqS[char] - 1;
            if (targetFreq.hasOwnProperty(char) && freqS[char] < targetFreq[char]) {
                valid--;
            }
            left++;
        }
        right++;
    }
    return result[0] == -1 ? "" : source.substring(result[1], result[2] + 1);
}

The JavaScript function shortestSubsequence aims to find the smallest substring in a given string source that contains all the characters of another string target. The process is detailed below:

  1. First, confirm that both source and target are non-empty strings. If any is empty, the function returns an empty string immediately.

  2. Construct a frequency map targetFreq for the characters in target, capturing how many times each character appears.

  3. Identify all valid positions in source where characters from target exist. Store these as pairs of index and character in array validS.

  4. Initiate two pointers (left and right), a counter valid (to count the number of target characters correctly matched within the window), and frequency map freqS for the substring window in source.

  5. As right pointer traverses validS, update freqS to reflect the counts of characters in the current window. Adjust valid each time a target character's count in the window matches its required count in targetFreq.

  6. Shift the left pointer to the right whenever the current window has all the necessary target characters (valid equals the length of targetFreq). During this shift, recalculate window dimensions (starting and ending indices) to potentially find a smaller valid window. Also modify freqS to decrease the count of character being removed from the window.

  7. If at least one valid window is found during the traversal, return the smallest such window. If no such window exists, return an empty string.

This function efficiently figures out the smallest window by using the sliding window technique and character count maps, ensuring that the substring contains all characters of the target in minimal length. This approach reduces unnecessary comparisons and efficiently narrows down the smallest substring containing all required characters.

python
class Solution:
    def shortestSubstring(self, source: str, target: str) -> str:
        if not target or not source:
            return ""
    
        target_count = Counter(target)
        required = len(target_count)
    
        indexed_source = []
        for index, character in enumerate(source):
            if character in target_count:
                indexed_source.append((index, character))
    
        left, right = 0, 0
        formed = 0
        current_counts = {}
    
        result = float("inf"), None, None
    
        while right < len(indexed_source):
            char = indexed_source[right][1]
            current_counts[char] = current_counts.get(char, 0) + 1
    
            if current_counts[char] == target_count[char]:
                formed += 1
    
            while left <= right and formed == required:
                char = indexed_source[left][1]
    
                end_index = indexed_source[right][0]
                start_index = indexed_source[left][0]
                if end_index - start_index + 1 < result[0]:
                    result = (end_index - start_index + 1, start_index, end_index)
    
                current_counts[char] -= 1
                if current_counts[char] < target_count[char]:
                    formed -= 1
                left += 1
    
            right += 1
    
        return "" if result[0] == float("inf") else source[result[1]: result[2] + 1]

This code snippet addresses the problem of finding the smallest substring in a given string (source) that contains all the characters of another string (target). The method employed here effectively uses a sliding window technique paired with character frequency counting.

  • Initialize a check for empty target or source, returning an empty string if either is true.
  • Create a frequency counter for the target to account for each character's occurrence.
  • Filter and index characters from the source that are present in the target.
  • Set up two pointers (left and right) to represent the current window's boundaries and use these to adjust the window size dynamically.
  • Use a dictionary to track the count of target characters within the current window (current_counts) and an integer to count when the required characters' correct frequencies are met (formed).
  • Iterate with the right pointer through the source, expanding the window. For each character:
    • Include the character in current_counts.
    • Once the frequency of a character matches its requirement in target_count, increment the formed count.
    • When the entire target requirement is met (formed equals the number of unique characters in target), attempt to minimize the window size from the left.
    • Track the smallest valid window by updating a result placeholder with the smallest size found along with the start and end indices.
    • Continue until you've processed all eligible characters from the source.
  • Finally, extract and return the smallest window substring from the source using the indices stored in result. If no valid window is found, return an empty string.

This solution is efficient in terms of time complexity, suitable for large inputs, and adheres closely to optimal string manipulation techniques through its compact and judicious use of space.

Comments

No comments yet.