
Problem Statement
This problem requires finding the smallest possible substring within a given string s, such that this substring contains all the characters of another string t, including their duplicates. The lengths of the strings s and t are denoted as m and n respectively. The function should return the shortest substring which includes all characters of t. If no such substring exists that satisfies these conditions, the function should return an empty string "". The provided solution needs to be unique for the given inputs, ensuring that t's characters are accounted for in the final substring result.
Examples
Example 1
Input:
s = "ADOBECODEBANC", t = "ABC"
Output:
"BANC"
Explanation:
The minimum window substring "BANC" includes 'A', 'B', and 'C' from string t.
Example 2
Input:
s = "a", t = "a"
Output:
"a"
Explanation:
The entire string s is the minimum window.
Example 3
Input:
s = "a", t = "aa"
Output:
""
Explanation:
Both 'a's from t must be included in the window. Since the largest window of s only has one 'a', return empty string.
Constraints
m == s.lengthn == t.length1 <= m, n <= 105sandtconsist of uppercase and lowercase English letters.
Approach and Intuition
To address this problem, the approach revolves around the concept of a sliding window and hashing:
Initialize Hash: Start by creating a hash map or count of all characters in
t. This will help keep track of which characters (including duplicates) still need to be included in the current substring window ofs.Sliding Window Technique: Utilize a sliding window represented by two pointers initially pointing to the start of
s. Expand and contract this window by moving the pointers based on the presence and sufficiency of the required characters:- Expand the right pointer to enlarge the window until it contains all the necessary characters from
t. - Once all characters are included, attempt to contract the window from the left to minimize its size without losing all required characters.
- Expand the right pointer to enlarge the window until it contains all the necessary characters from
Update Minimum Window: Each time a valid window containing all of
t's characters is found, compare it with the previously stored minimum window size and update if the current one is smaller.Boundary Conditions: Throughout the process, ensure conditions like non-existent characters from
tins, cases wheresis smaller thant, and other edge scenarios are handled gracefully.Final Result: After evaluating all potential windows, the smallest window that contains every character of
tis returned. If no such window exists, return "".
Case Studies from Examples:
- For
s = "ADOBECODEBANC",t = "ABC", the smallest window satisfying all contains oftis "BANC". - For the single character strings with
s = "a",t = "a", the entire stringsitself is the minimum window. - In cases where
thas more occurrences of a particular character thans, such ass = "a",t = "aa", it's impossible to get a valid window, and an empty string is returned.
Solutions
- C++
- Java
- C
- JavaScript
- Python
class Solution {
public:
string minimumWindowSubstring(string s, string t) {
struct CharIndex {
int pos;
char value;
};
if (s.empty() || t.empty()) {
return "";
}
unordered_map<char, int> targetFreq;
for (char c : t) {
targetFreq[c]++;
}
int needed = targetFreq.size();
vector<CharIndex> relevantChars;
for (int i = 0; i < s.length(); i++) {
if (targetFreq.count(s[i])) relevantChars.push_back({i, s[i]});
}
int left = 0, right = 0, formed = 0;
unordered_map<char, int> countWindow;
vector<int> result = {-1, 0, 0};
while (right < relevantChars.size()) {
char charRight = relevantChars[right].value;
countWindow[charRight]++;
if (targetFreq.count(charRight) && countWindow[charRight] == targetFreq[charRight]) {
formed++;
}
while (left <= right && formed == needed) {
char charLeft = relevantChars[left].value;
int endPos = relevantChars[right].pos;
int startPos = relevantChars[left].pos;
if (result[0] == -1 || endPos - startPos + 1 < result[0]) {
result[0] = endPos - startPos + 1;
result[1] = startPos;
result[2] = endPos;
}
countWindow[charLeft]--;
if (targetFreq.count(charLeft) && countWindow[charLeft] < targetFreq[charLeft]) {
formed--;
}
left++;
}
right++;
}
return result[0] == -1 ? "" : s.substr(result[1], result[0]);
}
};
The provided C++ program aims to solve the problem of finding the minimum window substring. This solution entails finding the smallest substring within a given string s that contains all the characters of another string t. Below is a detailed breakdown of the implemented algorithm:
- Initialize frequency counts for characters of string
tusing an unordered map,targetFreq. - Track the number of unique characters in
tthat need to be present in the substring. - Filter out positions in
sthat contain characters relevant totand store their indices and characters inrelevantChars. - Use two pointers,
leftandright, initialized at the start ofrelevantCharsto explore potential windows that can contain all the characters fromt. - Utilize another unordered map,
countWindow, to maintain the count of characters in the current window. - Use a vector,
result, to store the size of the current smallest window and its start and end positions ins. - Expand the window by moving
rightpointer and include characters incountWindow. - Once a valid window containing all required characters is found (checked by
formedvariable), attempt to reduce its size by sliding theleftpointer to the right. - For every valid window, compare its size with the smallest found so far and update
resultaccordingly. - After exploring all potential windows, extract and return the smallest window from
susing the indices recorded inresult.
This approach ensures that character positions relevant to t are the main focus, facilitating efficient window size adjustments and checks, thereby enhancing the overall performance of the substring search.
class Solution {
public String minimumWindowSubstring(String str, String target) {
if (str.isEmpty() || target.isEmpty()) {
return "";
}
Map<Character, Integer> targetFrequency = new HashMap<>();
for (int i = 0; i < target.length(); i++) {
int freq = targetFrequency.getOrDefault(target.charAt(i), 0);
targetFrequency.put(target.charAt(i), freq + 1);
}
int requiredChars = targetFrequency.size();
List<Pair<Integer, Character>> filteredStr = new ArrayList<>();
for (int i = 0; i < str.length(); i++) {
char currentChar = str.charAt(i);
if (targetFrequency.containsKey(currentChar)) {
filteredStr.add(new Pair<Integer, Character>(i, currentChar));
}
}
int left = 0, right = 0, formed = 0;
Map<Character, Integer> windowCharacters = new HashMap<>();
int[] result = {-1, 0, 0};
while (right < filteredStr.size()) {
char character = filteredStr.get(right).getValue();
int currentCount = windowCharacters.getOrDefault(character, 0);
windowCharacters.put(character, currentCount + 1);
if (targetFrequency.containsKey(character) &&
windowCharacters.get(character).intValue() == targetFrequency.get(character).intValue()) {
formed++;
}
while (left <= right && formed == requiredChars) {
character = filteredStr.get(left).getValue();
int endIndex = filteredStr.get(right).getKey();
int startIndex = filteredStr.get(left).getKey();
if (result[0] == -1 || endIndex - startIndex + 1 < result[0]) {
result[0] = endIndex - startIndex + 1;
result[1] = startIndex;
result[2] = endIndex;
}
windowCharacters.put(character, windowCharacters.get(character) - 1);
if (targetFrequency.containsKey(character) &&
windowCharacters.get(character).intValue() < targetFrequency.get(character).intValue()) {
formed--;
}
left++;
}
right++;
}
return result[0] == -1 ? "" : str.substring(result[1], result[2] + 1);
}
}
The Java solution implements a method to find the smallest substring in a given string str that contains all the characters of another string target. Here's a breakdown of how the solution works:
- Initialize frequency maps to store the count of each character required from
targetand the characters within the current window instr. - Filter out characters in
strthat are not part oftargetto optimize the search. - Use a sliding window approach with two pointers,
leftandright, to explore possible substrings that cover all required characters. - Expand the window by moving
rightpointer and adjust the window size by movingleftpointer when the window meets the requirement (contains alltargetcharacters) to potentially find a smaller valid window. - Maintain the character count of the current window and check against
target's required count. Adjust counters and a formation tracker as characters enter or leave the window. - Record the smallest window found during this operation using a
resultarray maintaining the length and indices of this window. - Return the smallest valid substring, or an empty string if no valid substring exists.
This approach ensures optimal exploration of str, using character mapping and filtering to optimize performance by focusing only on relevant characters and adjusting window size dynamically. The technique effectively handles cases with large input strings or when target characters are scattered widely across str.
struct Position {
int idx;
char value;
};
char* smallestWindowSubstring(char* S, char* T) {
if (strlen(S) == 0 || strlen(T) == 0) {
return "";
}
struct FrequencyMap {
char character;
int amount;
UT_hash_handle hh;
};
struct FrequencyMap *targetFreq = NULL, *element, *temporary, *currentWindow = NULL;
for (int i = 0; i < strlen(T); i++) {
HASH_FIND(hh, targetFreq, &T[i], sizeof(char), temporary);
if (temporary) {
temporary->amount++;
} else {
temporary = (struct FrequencyMap*) malloc(sizeof *temporary);
temporary->character = T[i];
temporary->amount = 1;
HASH_ADD(hh, targetFreq, character, sizeof(char), temporary);
}
}
int requiredMatches = HASH_COUNT(targetFreq);
struct Position* indices =
(struct Position*)malloc(strlen(S) * sizeof(struct Position));
int count = 0;
for (int i = 0; i < strlen(S); i++) {
HASH_FIND(hh, targetFreq, &(S[i]), sizeof(char), temporary);
if (temporary) {
indices[count].idx = i;
indices[count].value = S[i];
count++;
}
}
int left = 0, right = 0, formed = 0, result[3] = {-1, 0, 0};
while (right < count) {
char c = indices[right].value;
HASH_FIND(hh, currentWindow, &c, sizeof(char), temporary);
if (temporary) {
temporary->amount++;
} else {
temporary = (struct FrequencyMap*) malloc(sizeof *temporary);
temporary->character = c;
temporary->amount = 1;
HASH_ADD(hh, currentWindow, character, sizeof(char), temporary);
}
HASH_FIND(hh, targetFreq, &c, sizeof(char), element);
if (element && temporary->amount == element->amount) {
formed++;
}
while (left <= right && formed == requiredMatches) {
c = indices[left].value;
int end = indices[right].idx;
int start = indices[left].idx;
if (result[0] == -1 || end - start + 1 < result[0]) {
result[0] = end - start + 1;
result[1] = start;
result[2] = end;
}
HASH_FIND(hh, currentWindow, &c, sizeof(char), temporary);
if (temporary) {
temporary->amount--;
}
HASH_FIND(hh, targetFreq, &c, sizeof(char), element);
if (element && temporary->amount < element->amount) {
formed--;
}
left++;
}
right++;
}
return result[0] == -1 ? "" : strndup(S + result[1], result[0]);
}
The given C program is designed to solve the problem of finding the smallest substring in string S that includes all the characters of string T. Here’s how the solution is structured and implemented:
- Define a
Positionstructure to store indices and values of characters fromSthat are found inT. - Utilize a
FrequencyMapstructure withUTHashto efficiently count and access character frequencies for both stringsSandT. - First, generate a hashmap (
targetFreq) for the frequency of each character inT. - Create an array (
indices) ofPositionfor storing relevant character positions fromSthat are inT. - Two pointers technique (
leftandright) is used alongside a sliding window approach to explore potential minimal substrings inSthat cover all characters inTwith necessary counts. - Maintain a dynamic hashmap (
currentWindow) while adjusting the window with theleftandrightpointers, tracking the number of valid characters (formed) that match the required frequency intargetFreq. - Upon finding a valid window where the number of formed characters equals required unique characters in
T, attempt to update the smallest window size. - Once the smallest valid window is determined (if any), extract and return the substring from
S.
This code effectively uses advanced data structures and algorithms like hashing and sliding window, which are well-suited for optimization problems involving substring searches and frequency management, leading to a solution with potentially lower time complexity compared to naive approaches.
function shortestSubsequence(source, target) {
if (source.length == 0 || target.length == 0) {
return "";
}
let targetFreq = {};
for (let i = 0; i < target.length; i++) {
let charCount = targetFreq.hasOwnProperty(target[i]) ? targetFreq[target[i]] : 0;
targetFreq[target[i]] = charCount + 1;
}
let needed = Object.keys(targetFreq).length;
let validS = [];
for (let i = 0; i < source.length; i++) {
let char = source[i];
if (targetFreq.hasOwnProperty(char)) {
validS.push([i, char]);
}
}
let left = 0,
right = 0,
valid = 0;
let freqS = {};
let result = [-1, 0, 0];
while (right < validS.length) {
let char = validS[right][1];
let count = freqS.hasOwnProperty(char) ? freqS[char] : 0;
freqS[char] = count + 1;
if (targetFreq.hasOwnProperty(char) && freqS[char] == targetFreq[char]) {
valid++;
}
while (left <= right && valid == needed) {
char = validS[left][1];
let end = validS[right][0];
let start = validS[left][0];
if (result[0] == -1 || end - start + 1 < result[0]) {
result[0] = end - start + 1;
result[1] = start;
result[2] = end;
}
freqS[char] = freqS[char] - 1;
if (targetFreq.hasOwnProperty(char) && freqS[char] < targetFreq[char]) {
valid--;
}
left++;
}
right++;
}
return result[0] == -1 ? "" : source.substring(result[1], result[2] + 1);
}
The JavaScript function shortestSubsequence aims to find the smallest substring in a given string source that contains all the characters of another string target. The process is detailed below:
First, confirm that both
sourceandtargetare non-empty strings. If any is empty, the function returns an empty string immediately.Construct a frequency map
targetFreqfor the characters intarget, capturing how many times each character appears.Identify all valid positions in
sourcewhere characters fromtargetexist. Store these as pairs of index and character in arrayvalidS.Initiate two pointers (
leftandright), a countervalid(to count the number of target characters correctly matched within the window), and frequency mapfreqSfor the substring window insource.As
rightpointer traversesvalidS, updatefreqSto reflect the counts of characters in the current window. Adjustvalideach time a target character's count in the window matches its required count intargetFreq.Shift the
leftpointer to the right whenever the current window has all the necessary target characters (validequals the length oftargetFreq). During this shift, recalculate window dimensions (starting and ending indices) to potentially find a smaller valid window. Also modifyfreqSto decrease the count of character being removed from the window.If at least one valid window is found during the traversal, return the smallest such window. If no such window exists, return an empty string.
This function efficiently figures out the smallest window by using the sliding window technique and character count maps, ensuring that the substring contains all characters of the target in minimal length. This approach reduces unnecessary comparisons and efficiently narrows down the smallest substring containing all required characters.
class Solution:
def shortestSubstring(self, source: str, target: str) -> str:
if not target or not source:
return ""
target_count = Counter(target)
required = len(target_count)
indexed_source = []
for index, character in enumerate(source):
if character in target_count:
indexed_source.append((index, character))
left, right = 0, 0
formed = 0
current_counts = {}
result = float("inf"), None, None
while right < len(indexed_source):
char = indexed_source[right][1]
current_counts[char] = current_counts.get(char, 0) + 1
if current_counts[char] == target_count[char]:
formed += 1
while left <= right and formed == required:
char = indexed_source[left][1]
end_index = indexed_source[right][0]
start_index = indexed_source[left][0]
if end_index - start_index + 1 < result[0]:
result = (end_index - start_index + 1, start_index, end_index)
current_counts[char] -= 1
if current_counts[char] < target_count[char]:
formed -= 1
left += 1
right += 1
return "" if result[0] == float("inf") else source[result[1]: result[2] + 1]
This code snippet addresses the problem of finding the smallest substring in a given string (source) that contains all the characters of another string (target). The method employed here effectively uses a sliding window technique paired with character frequency counting.
- Initialize a check for empty
targetorsource, returning an empty string if either is true. - Create a frequency counter for the
targetto account for each character's occurrence. - Filter and index characters from the
sourcethat are present in thetarget. - Set up two pointers (
leftandright) to represent the current window's boundaries and use these to adjust the window size dynamically. - Use a dictionary to track the count of
targetcharacters within the current window (current_counts) and an integer to count when the required characters' correct frequencies are met (formed). - Iterate with the
rightpointer through thesource, expanding the window. For each character:- Include the character in
current_counts. - Once the frequency of a character matches its requirement in
target_count, increment theformedcount. - When the entire
targetrequirement is met (formedequals the number of unique characters intarget), attempt to minimize the window size from the left. - Track the smallest valid window by updating a result placeholder with the smallest size found along with the start and end indices.
- Continue until you've processed all eligible characters from the
source.
- Include the character in
- Finally, extract and return the smallest window substring from the
sourceusing the indices stored inresult. If no valid window is found, return an empty string.
This solution is efficient in terms of time complexity, suitable for large inputs, and adheres closely to optimal string manipulation techniques through its compact and judicious use of space.