
Problem Statement
The task is to find the length of the longest substring within a given string s that contains no more than two distinct characters. A substring is defined as a contiguous sequence of characters within a string. This problem involves identifying such subsequences where the diversity of characters is limited to two distinct types.
Examples
Example 1
Input:
s = "eceba"
Output:
3
Explanation:
The substring is "ece" which its length is 3.
Example 2
Input:
s = "ccaabbb"
Output:
5
Explanation:
The substring is "aabbb" which its length is 5.
Constraints
1 <= s.length <= 105sconsists of English letters.
Approach and Intuition
To solve this problem of finding the longest substring with at most two distinct characters, one efficient way is through the use of the sliding window technique combined with a hash map to track the distinct characters within the window. Here is a step-by-step breakdown of the approach:
Initialize two pointers or indices, namely
startandendboth set to the beginning of the string. These will represent the current window of examination.Utilize a hash map (
char_count) to track the frequency of each character within the current window defined bystartandend.Iterate over the string by expanding the
endpointer. Add each character encountered into the hash map and update its count.Whenever the size of the hash map exceeds two (which means the window now contains more than two distinct characters), increment the
startpointer to shrink the window from the left. During this process, reduce the count of the character at thestartposition in the hash map, and if its count drops to zero, remove the character from the hash map.Throughout the iteration, maintain a variable to store the maximum size of the window when the window contains at most two distinct characters.
Example Walk-throughs
Example 1:
- Input:
s = "eceba" - Initializing
startandendto 0, the substrings grow as "e", "ec", "ece". At point "ece", whenendis at index 2, we have 2 distinct characters. - Moving
endto index 3 introduces 'b', making it three distinct characters. We therefore movestartto index 1 which makes the substring "ceb". Hence, then trimming more to "ce" or "eb", and proceeding further allows substrings like "eba". The longest valid substring is "ece" with length 3.
- Input:
Example 2:
- Input:
s = "ccaabbb" - Start with the window "cc", then "cca", and expanding to "ccaa". At "ccaab", the window surpasses two distinct characters limit.
- Adjusting
startto drop excess characters, we then grow to get "aab" and later "aabbb". Finally, "aabbb" is the longest valid substring with length 5.
- Input:
By applying a direct examination of each potential window and efficiently tracking character counts, this method provides a clear and robust solution to the problem.
Solutions
- C++
- Java
- C
- JavaScript
- Python
class Solution {
public:
int longestDistinctTwoCharSubstring(string s) {
size_t len = s.size();
if (len < 3) {
return len;
}
int start = 0, end = 0;
map<char, int> positionMap;
int longest = 2;
while (end < len) {
positionMap[s[end]] = end;
end++;
if (positionMap.size() == 3) {
int deleteIndex = INT_MAX;
for (auto &entry : positionMap) {
deleteIndex = min(deleteIndex, entry.second);
}
positionMap.erase(s[deleteIndex]);
start = deleteIndex + 1;
}
longest = max(longest, end - start);
}
return longest;
}
};
The given C++ code provides a solution to find the length of the longest substring that contains at most two distinct characters. Here’s a breakdown of how the code efficiently accomplishes this:
- The function
longestDistinctTwoCharSubstringaccepts a stringsas input. - After initializing required variables and a map to track character positions in the substring, the code dynamically expands and contracts the substring within a single pass using two pointers strategy -
startandend. - The variable
endis incremented to expand the substring and the characters and their latest positions are stored inpositionMap. - If the map grows to contain three distinct characters, the program identifies the character with the least recent position (the smallest index) and removes it from the map. The
startpointer is then adjusted to the next index following the removed character, effectively contracting the substring. - Throughout this process, it continually updates the
longestvariable to maintain the length of the longest valid substring found. - This loop continues until the end of the string is reached, ensuring all possible substrings are considered.
The approach balances efficient substring expansion and contraction with dynamic tracking of character positions, ensuring optimal performance with a complexity primarily driven by the string length. This method is particularly efficient for cases where the string is significantly longer than the alphabet size used.
class Solution {
public int maxSubstringLengthTwoDistinct(String str) {
int strLen = str.length();
if (strLen < 3) return strLen;
int start = 0;
int end = 0;
HashMap<Character, Integer> charPositionMap = new HashMap<>();
int maxLength = 2;
while (end < strLen) {
charPositionMap.put(str.charAt(end), end++);
if (charPositionMap.size() == 3) {
int minIndex = Collections.min(charPositionMap.values());
charPositionMap.remove(str.charAt(minIndex));
start = minIndex + 1;
}
maxLength = Math.max(maxLength, end - start);
}
return maxLength;
}
}
This solution in Java addresses the problem of finding the length of the longest substring that contains at most two distinct characters.
First, check if the length of the input string is less than three. If so, return the string length directly, as it naturally contains fewer than or equal to two distinct characters.
Initialize two pointers,
startandend, to manage the sliding window within the string, as well as aHashMapcalledcharPositionMapto store the most recent positions of characters.Set an initial
maxLengthto 2, as the minimum length of substring when the window has two distinct characters.Use a loop to iterate through the string. For each character at the position
end, store its position incharPositionMapand incrementend.When
charPositionMaphas three distinct characters, identify the character with the minimum index in the map and remove it to maintain only two distinct characters in the map. Adjust thestartpointer just past the removed character's position.Continuously update
maxLengthwith the larger value between the currentmaxLengthand the length of the current valid substring (fromstarttoend).Finally, return the computed
maxLength.
This approach efficiently manages the search space using a sliding window technique and a map to track character positions, ensuring that the substring always contains two or fewer distinct characters.
struct storage {
char keyId;
int value;
UT_hash_handle hhRef;
};
int findMaxLengthTwoDistinctCharacters(char *str) {
size_t length = strlen(str);
if (length < 3) {
return length;
}
int start = 0, end = 0;
struct storage *storageMap = NULL, *item, *tempItem;
int maxLength = 2;
while (end < length) {
HASH_FIND(hhRef, storageMap, &str[end], sizeof(char), item);
if (item != NULL) {
HASH_DEL(storageMap, item);
free(item);
}
item = (struct storage *)malloc(sizeof(*item));
if (item == NULL) return -1;
item->keyId = str[end];
item->value = end;
HASH_ADD_KEYPTR(hhRef, storageMap, &item->keyId, sizeof(char), item);
end++;
if (HASH_COUNT(storageMap) == 3) {
int minIdx = INT_MAX;
HASH_ITER(hhRef, storageMap, item, tempItem) {
minIdx = minIdx < item->value ? minIdx : item->value;
}
HASH_FIND(hhRef, storageMap, &str[minIdx], sizeof(char), item);
if (item != NULL) {
HASH_DEL(storageMap, item);
free(item);
}
start = minIdx + 1;
}
maxLength = maxLength > (end - start) ? maxLength : (end - start);
}
return maxLength;
}
This solution addresses the problem of finding the length of the longest substring with at most two distinct characters from a given string. The implementation, written in C, uses a combination of a hashing technique and a sliding window approach.
- A
struct storageis defined to store each character (keyId) and its latest position (value) in the string.UT_hash_handleis utilized to enable hashing of the struct. - The function
findMaxLengthTwoDistinctCharactersreceives a pointer to the character string. It initially checks if the string length is less than 3, immediately returning the string's length since the result is trivial in such cases. - A sliding window is utilized with two pointers,
startandend, initially at the start of the string.storageMapis introduced to keep track of up to two unique characters and their last occurrences. - As
endincreases, it examines each character until the end of the string:- If the character already exists in
storageMap, its entry gets updated. - For each new character, it creates a new
struct storage, assigns the character and its current index, and adds it to thestorageMap. - The sliding window adjusts when
storageMapexceeds two distinct characters by removing the character with the smallest index and moving thestartpointer forward.
- If the character already exists in
- The length of the current valid substring (
end - start) is compared againstmaxLengthto keep track of the maximum found so far. - When the traversal is complete, the function returns
maxLength, representing the length of the longest substring with at most two distinct characters.
This implementation effectively manages the sliding window and character index with dynamic memory management and hashing structures, providing an efficient solution to the problem.
var maxSubstringTwoDistinctChars = function (str) {
let len = str.length;
if (len < 3) return len;
let start = 0;
let end = 0;
let charPositions = {};
let max_length = 2;
while (end < len) {
charPositions[str[end]] = end++;
if (Object.keys(charPositions).length === 3) {
let minIndex = Math.min(...Object.values(charPositions));
delete charPositions[str[minIndex]];
start = minIndex + 1;
}
max_length = Math.max(max_length, end - start);
}
return max_length;
};
This JavaScript solution addresses the problem of finding the longest substring that contains at most two distinct characters in a given string.
The function maxSubstringTwoDistinctChars starts by checking if the string length is less than three. If it is, the function directly returns the string length, as any string shorter than three inevitably meets the criteria.
The core logic involves two pointers, start and end, which initially point to the beginning of the string. The charPositions object keeps track of the last occurrence index of each character encountered as end traverses through the string.
- Initialize
max_lengthto 2, since the minimum meaningful output when two distinct characters are present is a substring of length 2. - Use a while loop to iterate through the string. For each character at position
end, update its latest occurrence incharPositionsand incrementend. - Check if the number of distinct characters, determined by the key count in
charPositions, exceeds two. If it does, calculate the minimum index from the current character positions to find the oldest character position, remove this character fromcharPositions, and updatestartto focus only on the substring from the next index of this oldest character. - Continuously update
max_lengthwith the maximum between its current value and the length of the current substring defined bystarttoend.
At the end of the iteration, max_length contains the length of the longest substring composed of at most two distinct characters. This is returned as the final result, presenting an efficient approach with a time complexity of O(n), where n is the length of the input string.
from collections import defaultdict
class Solution:
def longestSubstringTwoDistinctChars(self, s: str) -> int:
length = len(s)
if length < 3:
return length
# Initialize window pointers and character map
start, end = 0, 0
char_map = defaultdict()
longest = 2
while end < length:
# Update the current character's last seen index
char_map[s[end]] = end
end += 1
# If the map contains 3 distinct characters, shrink the window
if len(char_map) == 3:
# Find the index of the oldest character and remove it
oldest = min(char_map.values())
del char_map[s[oldest]]
# Adjust the start of the window
start = oldest + 1
# Calculate the maximum length encountered
longest = max(longest, end - start)
return longest
The Python program provided tackles the challenge of finding the length of the longest substring that contains at most two distinct characters from a given string s. Follow these steps to understand the process:
- Check if the length of string
sis less than 3. Return this length as any substring will have at most two distinct characters in this case. - Initialize two pointers,
startandend, to manage the sliding window and a dictionarychar_mapto track the characters and their indices within the window. - Set the initial value of
longestto 2, which will store the maximum length of substrings found that meet the criteria. - Utilize a while loop to iterate as long as
endis less than the string length. Within the loop:- Map the current character at position
endto its index. - Increment the
endpointer to slide the window to the right. - If
char_mapcontains three distinct characters, identify the index of the earliest occurring character (oldest) and remove it from thechar_map. Then, adjust thestartpointer to the position right after this oldest character. - Update
longestwith the larger value betweenlongestand the current window size (end - start).
- Map the current character at position
- Finally, return the value stored in
longestas the length of the longest substring with at most two distinct characters.
This solution effectively manages the sliding window and utilizes a dictionary to keep track of character occurrences, optimizing the process of finding the correct substring length.