
Problem Statement
In this problem, you are provided with a string s
which consists solely of lowercase English letters. The task is to find the length of the longest "special" substring from s
that appears at least three times. A substring is considered "special" if it consists of only one type of character repeated continuously. If no such substring occurs three times, you should return -1. This problem requires you to efficiently identify repeating character sequences within the string and determine the maximum length of those that meet the frequency condition.
Examples
Example 1
Input:
s = "aaaa"
Output:
2
Explanation:
The longest special substring which occurs at least three times is "aa". It appears at positions: 0–1, 1–2, and 2–3.
Example 2
Input:
s = "abcdef"
Output:
-1
Explanation:
There exists no special substring which occurs at least three times. Hence return -1.
Example 3
Input:
s = "abcaba"
Output:
1
Explanation:
The longest special substring which occurs at least three times is "a". It appears at positions: 0, 3, and 5.
Constraints
3 <= s.length <= 5 * 10^5
s[i]
consists of lowercase English letters ('a'
to'z'
)
Approach and Intuition
Identify Continuous Substrings:
- Begin by traversing the string
s
from the start to the end. - Identify all continuous chunks of the same character. For instance, if a part of the string looks like "aaabbcc," the identified chunks would be "aaa," "bb," and "cc."
- Begin by traversing the string
Calculate Lengths and Frequency:
- For each chunk or segment of identical characters, record its length.
- Maintain a frequency distribution of the lengths of these segments to easily check how often a segment of a particular length appears.
Determine the Longest Substring Fulfilling the Criteria:
- Now inspect the recorded lengths. Starting from the longest, check if any are repeated at least three times.
- If such a length exists, it would be your answer since you will be checking lengths in descending order.
- If no lengths repeat the required number of times, return -1 as specified.
Edge Cases:
- If the string has less diversity in characters or is almost entirely made up of a single letter (but less than three characters of the kind), results can promptly be determined.
- The algorithm is optimized by stopping early if the maximum possible special substring can no longer meet the needed frequency as traversal continues.
In summary, the approach banks on grouping like characters together and rapidly assessing their lengths and frequencies, making effective use of associative arrays (dictionaries) for quick lookup and update operations. This method ensures that the task is executed in a time-efficient manner.
Solutions
- C++
- Java
- Python
class Solution {
public:
int& smallest(int& x, int& y, int& z) {
return x < (y < z ? y : z) ? x : y < z ? y : z;
}
int longestSubstring(string str) {
int currentLength = 0, maxResult = -1;
char lastChar;
vector<vector<int>> lengthsCount(26, vector<int>(3, -1));
for (char ch : str) {
if (ch == lastChar) {
currentLength++;
} else {
currentLength = 1;
lastChar = ch;
}
int& minLen = smallest(lengthsCount[ch - 'a'][0],
lengthsCount[ch - 'a'][1],
lengthsCount[ch - 'a'][2]);
if (currentLength > minLen) minLen = currentLength;
}
for (char ch = 'a'; ch <= 'z'; ch++)
maxResult = max(maxResult, smallest(lengthsCount[ch - 'a'][0],
lengthsCount[ch - 'a'][1],
lengthsCount[ch - 'a'][2]));
return maxResult;
}
};
In the given C++ solution, the goal is to find the length of the longest substring that contains a character occurring exactly three times consecutively in a string str
. The implementation effectively tracks the lengths of contiguous substrings made up of the same character and adjusts the counts accordingly using a specialized comparison function.
The solution is encapsulated in a class
Solution
with a private method and a public method:smallest(int& x, int& y, int& z)
- Returns the smallest of three integers by value, enabling the tracking of the three longest occurrences of each character.longestSubstring(string str)
- This is the main function that iterates over the string to determine the longest substring length where any character appears exactly three times consecutively.
In
longestSubstring
:- A 2D vector
lengthsCount
of dimension 26x3 is initialized to -1. Each row corresponds to a character from 'a' to 'z', and each column holds the top three lengths of consecutive character occurrences. - The function iterates over the characters of the input string. For each character:
- If the current character is the same as the last one, increments the
currentLength
. - If it is different, resets
currentLength
to 1. - A reference to the minimum of the three tracked lengths for the current character (
minLen
) is obtained using thesmallest
function.minLen
is updated ifcurrentLength
exceeds its value.
- If the current character is the same as the last one, increments the
- After processing the string, the function computes the maximum value among the smallest values of the three tracked lengths for each character, determining the longest substring length where the same character appears thrice.
- A 2D vector
Finally,
longestSubstring
returnsmaxResult
, which is the maximum length among all characters' three occurrences or-1
if no such substring exists.
class Processor {
public int smallest(int x, int y, int z) {
return x < Math.min(y, z) ? x : Math.min(y, z);
}
public int maxSubLength(String str) {
int currentLength = 0, maxResult = -1;
char lastChar = '\0';
int[][] data = new int[26][3];
// Set data array to -1
for (int i = 0; i < 26; i++) {
for (int j = 0; j < 3; j++) {
data[i][j] = -1;
}
}
for (char c : str.toCharArray()) {
if (c == lastChar) {
currentLength++;
} else {
currentLength = 1;
lastChar = c;
}
// Update data if currentLength is greater than one of the stored lengths
int idx = c - 'a';
int least = smallest(
data[idx][0],
data[idx][1],
data[idx][2]
);
if (currentLength > least) {
if (data[idx][0] == least) {
data[idx][0] = currentLength;
} else if (data[idx][1] == least) {
data[idx][1] = currentLength;
} else {
data[idx][2] = currentLength;
}
}
}
// Calculate the max of minimum values stored in data
for (int i = 0; i < 26; i++) {
maxResult = Math.max(
maxResult,
smallest(
data[i][0],
data[i][1],
data[i][2]
)
);
}
return maxResult;
}
}
The Java solution provided resolves the challenge of finding the longest special substring that occurs precisely three times.
To understand the approach, focus on these key operations:
Smallest Element Finder: The
smallest
method is a helper function that determines the smallest number among three integers, effectively ensuring comparisons are streamlined in other parts of the code.Initialization of Arrays: Start by initializing a 2D array
data
sized 26x3 (accounting for each letter of the alphabet and storing up to three lengths). Initialize all values to -1 as default, establishing a baseline for updates as substrings are evaluated.Character Processing: Iterate through each character of the string, comparing it to the last seen character. If it's the same, increment the current substring length (
currentLength
), otherwise reset this length to 1. Store the maximum length of these substrings for each character, ensuring to replace the shortest recorded length when a new longer substring is found.Updating Data and Storage Logic: For each character processing cycle, update the
data
array, ensuring that you're only updating the length if a freshly calculatedcurrentLength
exceeds one of the previously stored lengths for that character.Result Calculation: Finally, determine the longest substring seen three times by calculating the maximum of the smallest values stored in
data
.
This logic manages to efficiently track and update the longest lengths of recurring substrings in a single pass, using straightforward array manipulation and helper functions to maintain clarity and performance. This approach is particularly effective for strings where character distributions and repeating patterns are uneven, making it robust for a variety of inputs.
class Solution:
def maximumSubstringLength(self, input_str: str) -> int:
max_length = 0
best_length = -1
last_char = ""
record_lengths = [[-1, -1, -1] for _ in range(26)]
for char in input_str:
if char == last_char:
max_length += 1
else:
max_length = 1
last_char = char
# Update the list with new max_length if it's greater than one of the three stored lengths
min_val = min(record_lengths[ord(char) - ord("a")])
if max_length > min_val:
char_index = ord(char) - ord("a")
record_lengths[char_index][record_lengths[char_index].index(min_val)] = max_length
# Calculate the greatest minimum value among all characters
for i in range(26):
best_length = max(best_length, min(record_lengths[i]))
return best_length
The provided Python code defines a function maximumSubstringLength
within a Solution
class that computes the length of the longest substring which appears at least three times consecutively for any character within the given string input_str
. The function uses a strategy that involves computing repeated character sequences and updating and assessing minimum values conditionally. Here’s a breakdown of the approach:
- Initialize tracking variables:
max_length
to track the current maximum length of a repeating character,best_length
to store the result, andlast_char
to remember the last visited character. record_lengths
is a list of lists, where each sublist corresponds to a character in the alphabet recording the top three lengths of sequences that character has repeated consecutively.- For each character in
input_str
:- Check if it's the same as the last character to either increment
max_length
or reset it. - Update
record_lengths
for that character ifmax_length
exceeds one of its recorded lengths.
- Check if it's the same as the last character to either increment
- Complete by determining the maximum minimal value from
record_lengths
which represents the longest special substring length that appears at least three times.
This method optimizes the checking and recording processes for characters, ensuring efficient updating of sequence lengths using comparisons and list operations.
No comments yet.