
Problem Statement
The task is to find the length of the longest substring within a given string s
that contains no more than two distinct characters. A substring is defined as a contiguous sequence of characters within a string. This problem involves identifying such subsequences where the diversity of characters is limited to two distinct types.
Examples
Example 1
Input:
s = "eceba"
Output:
3
Explanation:
The substring is "ece" which its length is 3.
Example 2
Input:
s = "ccaabbb"
Output:
5
Explanation:
The substring is "aabbb" which its length is 5.
Constraints
1 <= s.length <= 105
s
consists of English letters.
Approach and Intuition
To solve this problem of finding the longest substring with at most two distinct characters, one efficient way is through the use of the sliding window technique combined with a hash map to track the distinct characters within the window. Here is a step-by-step breakdown of the approach:
Initialize two pointers or indices, namely
start
andend
both set to the beginning of the string. These will represent the current window of examination.Utilize a hash map (
char_count
) to track the frequency of each character within the current window defined bystart
andend
.Iterate over the string by expanding the
end
pointer. Add each character encountered into the hash map and update its count.Whenever the size of the hash map exceeds two (which means the window now contains more than two distinct characters), increment the
start
pointer to shrink the window from the left. During this process, reduce the count of the character at thestart
position in the hash map, and if its count drops to zero, remove the character from the hash map.Throughout the iteration, maintain a variable to store the maximum size of the window when the window contains at most two distinct characters.
Example Walk-throughs
Example 1:
- Input:
s = "eceba"
- Initializing
start
andend
to 0, the substrings grow as "e", "ec", "ece". At point "ece", whenend
is at index 2, we have 2 distinct characters. - Moving
end
to index 3 introduces 'b', making it three distinct characters. We therefore movestart
to index 1 which makes the substring "ceb". Hence, then trimming more to "ce" or "eb", and proceeding further allows substrings like "eba". The longest valid substring is "ece" with length 3.
- Input:
Example 2:
- Input:
s = "ccaabbb"
- Start with the window "cc", then "cca", and expanding to "ccaa". At "ccaab", the window surpasses two distinct characters limit.
- Adjusting
start
to drop excess characters, we then grow to get "aab" and later "aabbb". Finally, "aabbb" is the longest valid substring with length 5.
- Input:
By applying a direct examination of each potential window and efficiently tracking character counts, this method provides a clear and robust solution to the problem.
Solutions
- C++
- Java
- C
- JavaScript
- Python
class Solution {
public:
int longestDistinctTwoCharSubstring(string s) {
size_t len = s.size();
if (len < 3) {
return len;
}
int start = 0, end = 0;
map<char, int> positionMap;
int longest = 2;
while (end < len) {
positionMap[s[end]] = end;
end++;
if (positionMap.size() == 3) {
int deleteIndex = INT_MAX;
for (auto &entry : positionMap) {
deleteIndex = min(deleteIndex, entry.second);
}
positionMap.erase(s[deleteIndex]);
start = deleteIndex + 1;
}
longest = max(longest, end - start);
}
return longest;
}
};
The given C++ code provides a solution to find the length of the longest substring that contains at most two distinct characters. Here’s a breakdown of how the code efficiently accomplishes this:
- The function
longestDistinctTwoCharSubstring
accepts a strings
as input. - After initializing required variables and a map to track character positions in the substring, the code dynamically expands and contracts the substring within a single pass using two pointers strategy -
start
andend
. - The variable
end
is incremented to expand the substring and the characters and their latest positions are stored inpositionMap
. - If the map grows to contain three distinct characters, the program identifies the character with the least recent position (the smallest index) and removes it from the map. The
start
pointer is then adjusted to the next index following the removed character, effectively contracting the substring. - Throughout this process, it continually updates the
longest
variable to maintain the length of the longest valid substring found. - This loop continues until the end of the string is reached, ensuring all possible substrings are considered.
The approach balances efficient substring expansion and contraction with dynamic tracking of character positions, ensuring optimal performance with a complexity primarily driven by the string length. This method is particularly efficient for cases where the string is significantly longer than the alphabet size used.
class Solution {
public int maxSubstringLengthTwoDistinct(String str) {
int strLen = str.length();
if (strLen < 3) return strLen;
int start = 0;
int end = 0;
HashMap<Character, Integer> charPositionMap = new HashMap<>();
int maxLength = 2;
while (end < strLen) {
charPositionMap.put(str.charAt(end), end++);
if (charPositionMap.size() == 3) {
int minIndex = Collections.min(charPositionMap.values());
charPositionMap.remove(str.charAt(minIndex));
start = minIndex + 1;
}
maxLength = Math.max(maxLength, end - start);
}
return maxLength;
}
}
This solution in Java addresses the problem of finding the length of the longest substring that contains at most two distinct characters.
First, check if the length of the input string is less than three. If so, return the string length directly, as it naturally contains fewer than or equal to two distinct characters.
Initialize two pointers,
start
andend
, to manage the sliding window within the string, as well as aHashMap
calledcharPositionMap
to store the most recent positions of characters.Set an initial
maxLength
to 2, as the minimum length of substring when the window has two distinct characters.Use a loop to iterate through the string. For each character at the position
end
, store its position incharPositionMap
and incrementend
.When
charPositionMap
has three distinct characters, identify the character with the minimum index in the map and remove it to maintain only two distinct characters in the map. Adjust thestart
pointer just past the removed character's position.Continuously update
maxLength
with the larger value between the currentmaxLength
and the length of the current valid substring (fromstart
toend
).Finally, return the computed
maxLength
.
This approach efficiently manages the search space using a sliding window technique and a map to track character positions, ensuring that the substring always contains two or fewer distinct characters.
struct storage {
char keyId;
int value;
UT_hash_handle hhRef;
};
int findMaxLengthTwoDistinctCharacters(char *str) {
size_t length = strlen(str);
if (length < 3) {
return length;
}
int start = 0, end = 0;
struct storage *storageMap = NULL, *item, *tempItem;
int maxLength = 2;
while (end < length) {
HASH_FIND(hhRef, storageMap, &str[end], sizeof(char), item);
if (item != NULL) {
HASH_DEL(storageMap, item);
free(item);
}
item = (struct storage *)malloc(sizeof(*item));
if (item == NULL) return -1;
item->keyId = str[end];
item->value = end;
HASH_ADD_KEYPTR(hhRef, storageMap, &item->keyId, sizeof(char), item);
end++;
if (HASH_COUNT(storageMap) == 3) {
int minIdx = INT_MAX;
HASH_ITER(hhRef, storageMap, item, tempItem) {
minIdx = minIdx < item->value ? minIdx : item->value;
}
HASH_FIND(hhRef, storageMap, &str[minIdx], sizeof(char), item);
if (item != NULL) {
HASH_DEL(storageMap, item);
free(item);
}
start = minIdx + 1;
}
maxLength = maxLength > (end - start) ? maxLength : (end - start);
}
return maxLength;
}
This solution addresses the problem of finding the length of the longest substring with at most two distinct characters from a given string. The implementation, written in C, uses a combination of a hashing technique and a sliding window approach.
- A
struct storage
is defined to store each character (keyId
) and its latest position (value
) in the string.UT_hash_handle
is utilized to enable hashing of the struct. - The function
findMaxLengthTwoDistinctCharacters
receives a pointer to the character string. It initially checks if the string length is less than 3, immediately returning the string's length since the result is trivial in such cases. - A sliding window is utilized with two pointers,
start
andend
, initially at the start of the string.storageMap
is introduced to keep track of up to two unique characters and their last occurrences. - As
end
increases, it examines each character until the end of the string:- If the character already exists in
storageMap
, its entry gets updated. - For each new character, it creates a new
struct storage
, assigns the character and its current index, and adds it to thestorageMap
. - The sliding window adjusts when
storageMap
exceeds two distinct characters by removing the character with the smallest index and moving thestart
pointer forward.
- If the character already exists in
- The length of the current valid substring (
end - start
) is compared againstmaxLength
to keep track of the maximum found so far. - When the traversal is complete, the function returns
maxLength
, representing the length of the longest substring with at most two distinct characters.
This implementation effectively manages the sliding window and character index with dynamic memory management and hashing structures, providing an efficient solution to the problem.
var maxSubstringTwoDistinctChars = function (str) {
let len = str.length;
if (len < 3) return len;
let start = 0;
let end = 0;
let charPositions = {};
let max_length = 2;
while (end < len) {
charPositions[str[end]] = end++;
if (Object.keys(charPositions).length === 3) {
let minIndex = Math.min(...Object.values(charPositions));
delete charPositions[str[minIndex]];
start = minIndex + 1;
}
max_length = Math.max(max_length, end - start);
}
return max_length;
};
This JavaScript solution addresses the problem of finding the longest substring that contains at most two distinct characters in a given string.
The function maxSubstringTwoDistinctChars
starts by checking if the string length is less than three. If it is, the function directly returns the string length, as any string shorter than three inevitably meets the criteria.
The core logic involves two pointers, start
and end
, which initially point to the beginning of the string. The charPositions
object keeps track of the last occurrence index of each character encountered as end
traverses through the string.
- Initialize
max_length
to 2, since the minimum meaningful output when two distinct characters are present is a substring of length 2. - Use a while loop to iterate through the string. For each character at position
end
, update its latest occurrence incharPositions
and incrementend
. - Check if the number of distinct characters, determined by the key count in
charPositions
, exceeds two. If it does, calculate the minimum index from the current character positions to find the oldest character position, remove this character fromcharPositions
, and updatestart
to focus only on the substring from the next index of this oldest character. - Continuously update
max_length
with the maximum between its current value and the length of the current substring defined bystart
toend
.
At the end of the iteration, max_length
contains the length of the longest substring composed of at most two distinct characters. This is returned as the final result, presenting an efficient approach with a time complexity of O(n), where n is the length of the input string.
from collections import defaultdict
class Solution:
def longestSubstringTwoDistinctChars(self, s: str) -> int:
length = len(s)
if length < 3:
return length
# Initialize window pointers and character map
start, end = 0, 0
char_map = defaultdict()
longest = 2
while end < length:
# Update the current character's last seen index
char_map[s[end]] = end
end += 1
# If the map contains 3 distinct characters, shrink the window
if len(char_map) == 3:
# Find the index of the oldest character and remove it
oldest = min(char_map.values())
del char_map[s[oldest]]
# Adjust the start of the window
start = oldest + 1
# Calculate the maximum length encountered
longest = max(longest, end - start)
return longest
The Python program provided tackles the challenge of finding the length of the longest substring that contains at most two distinct characters from a given string s
. Follow these steps to understand the process:
- Check if the length of string
s
is less than 3. Return this length as any substring will have at most two distinct characters in this case. - Initialize two pointers,
start
andend
, to manage the sliding window and a dictionarychar_map
to track the characters and their indices within the window. - Set the initial value of
longest
to 2, which will store the maximum length of substrings found that meet the criteria. - Utilize a while loop to iterate as long as
end
is less than the string length. Within the loop:- Map the current character at position
end
to its index. - Increment the
end
pointer to slide the window to the right. - If
char_map
contains three distinct characters, identify the index of the earliest occurring character (oldest
) and remove it from thechar_map
. Then, adjust thestart
pointer to the position right after this oldest character. - Update
longest
with the larger value betweenlongest
and the current window size (end - start
).
- Map the current character at position
- Finally, return the value stored in
longest
as the length of the longest substring with at most two distinct characters.
This solution effectively manages the sliding window and utilizes a dictionary to keep track of character occurrences, optimizing the process of finding the correct substring length.
No comments yet.