Find Longest Special Substring That Occurs Thrice II

Updated on 28 May, 2025
Find Longest Special Substring That Occurs Thrice II header image

Problem Statement

In this problem, you are provided with a string s which consists solely of lowercase English letters. The task is to find the length of the longest "special" substring from s that appears at least three times. A substring is considered "special" if it consists of only one type of character repeated continuously. If no such substring occurs three times, you should return -1. This problem requires you to efficiently identify repeating character sequences within the string and determine the maximum length of those that meet the frequency condition.

Examples

Example 1

Input:

s = "aaaa"

Output:

2

Explanation:

The longest special substring which occurs at least three times is "aa". It appears at positions: 0–1, 1–2, and 2–3.

Example 2

Input:

s = "abcdef"

Output:

-1

Explanation:

There exists no special substring which occurs at least three times. Hence return -1.

Example 3

Input:

s = "abcaba"

Output:

1

Explanation:

The longest special substring which occurs at least three times is "a". It appears at positions: 0, 3, and 5.

Constraints

  • 3 <= s.length <= 5 * 10^5
  • s[i] consists of lowercase English letters ('a' to 'z')

Approach and Intuition

  1. Identify Continuous Substrings:

    • Begin by traversing the string s from the start to the end.
    • Identify all continuous chunks of the same character. For instance, if a part of the string looks like "aaabbcc," the identified chunks would be "aaa," "bb," and "cc."
  2. Calculate Lengths and Frequency:

    • For each chunk or segment of identical characters, record its length.
    • Maintain a frequency distribution of the lengths of these segments to easily check how often a segment of a particular length appears.
  3. Determine the Longest Substring Fulfilling the Criteria:

    • Now inspect the recorded lengths. Starting from the longest, check if any are repeated at least three times.
    • If such a length exists, it would be your answer since you will be checking lengths in descending order.
    • If no lengths repeat the required number of times, return -1 as specified.
  4. Edge Cases:

    • If the string has less diversity in characters or is almost entirely made up of a single letter (but less than three characters of the kind), results can promptly be determined.
    • The algorithm is optimized by stopping early if the maximum possible special substring can no longer meet the needed frequency as traversal continues.

In summary, the approach banks on grouping like characters together and rapidly assessing their lengths and frequencies, making effective use of associative arrays (dictionaries) for quick lookup and update operations. This method ensures that the task is executed in a time-efficient manner.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
public:
    int& smallest(int& x, int& y, int& z) {
        return x < (y < z ? y : z) ? x : y < z ? y : z;
    }
    int longestSubstring(string str) {
        int currentLength = 0, maxResult = -1;
        char lastChar;
        vector<vector<int>> lengthsCount(26, vector<int>(3, -1));
        for (char ch : str) {
            if (ch == lastChar) {
                currentLength++;
            } else {
                currentLength = 1;
                lastChar = ch;
            }
            int& minLen = smallest(lengthsCount[ch - 'a'][0],
                                   lengthsCount[ch - 'a'][1],
                                   lengthsCount[ch - 'a'][2]);
            if (currentLength > minLen) minLen = currentLength;
        }

        for (char ch = 'a'; ch <= 'z'; ch++)
            maxResult = max(maxResult, smallest(lengthsCount[ch - 'a'][0],
                                                lengthsCount[ch - 'a'][1],
                                                lengthsCount[ch - 'a'][2]));
        return maxResult;
    }
};

In the given C++ solution, the goal is to find the length of the longest substring that contains a character occurring exactly three times consecutively in a string str. The implementation effectively tracks the lengths of contiguous substrings made up of the same character and adjusts the counts accordingly using a specialized comparison function.

  • The solution is encapsulated in a class Solution with a private method and a public method:

    • smallest(int& x, int& y, int& z) - Returns the smallest of three integers by value, enabling the tracking of the three longest occurrences of each character.
    • longestSubstring(string str) - This is the main function that iterates over the string to determine the longest substring length where any character appears exactly three times consecutively.
  • In longestSubstring:

    • A 2D vector lengthsCount of dimension 26x3 is initialized to -1. Each row corresponds to a character from 'a' to 'z', and each column holds the top three lengths of consecutive character occurrences.
    • The function iterates over the characters of the input string. For each character:
      1. If the current character is the same as the last one, increments the currentLength.
      2. If it is different, resets currentLength to 1.
      3. A reference to the minimum of the three tracked lengths for the current character (minLen) is obtained using the smallest function. minLen is updated if currentLength exceeds its value.
    • After processing the string, the function computes the maximum value among the smallest values of the three tracked lengths for each character, determining the longest substring length where the same character appears thrice.
  • Finally, longestSubstring returns maxResult, which is the maximum length among all characters' three occurrences or -1 if no such substring exists.

java
class Processor {
    public int smallest(int x, int y, int z) {
        return x < Math.min(y, z) ? x : Math.min(y, z);
    }

    public int maxSubLength(String str) {
        int currentLength = 0, maxResult = -1;
        char lastChar = '\0';
        int[][] data = new int[26][3];

        // Set data array to -1
        for (int i = 0; i < 26; i++) {
            for (int j = 0; j < 3; j++) {
                data[i][j] = -1;
            }
        }

        for (char c : str.toCharArray()) {
            if (c == lastChar) {
                currentLength++;
            } else {
                currentLength = 1;
                lastChar = c;
            }

            // Update data if currentLength is greater than one of the stored lengths
            int idx = c - 'a';
            int least = smallest(
                data[idx][0],
                data[idx][1],
                data[idx][2]
            );
            if (currentLength > least) {
                if (data[idx][0] == least) {
                    data[idx][0] = currentLength;
                } else if (data[idx][1] == least) {
                    data[idx][1] = currentLength;
                } else {
                    data[idx][2] = currentLength;
                }
            }
        }

        // Calculate the max of minimum values stored in data
        for (int i = 0; i < 26; i++) {
            maxResult = Math.max(
                maxResult,
                smallest(
                    data[i][0],
                    data[i][1],
                    data[i][2]
                )
            );
        }

        return maxResult;
    }
}

The Java solution provided resolves the challenge of finding the longest special substring that occurs precisely three times.

To understand the approach, focus on these key operations:

  • Smallest Element Finder: The smallest method is a helper function that determines the smallest number among three integers, effectively ensuring comparisons are streamlined in other parts of the code.

  • Initialization of Arrays: Start by initializing a 2D array data sized 26x3 (accounting for each letter of the alphabet and storing up to three lengths). Initialize all values to -1 as default, establishing a baseline for updates as substrings are evaluated.

  • Character Processing: Iterate through each character of the string, comparing it to the last seen character. If it's the same, increment the current substring length (currentLength), otherwise reset this length to 1. Store the maximum length of these substrings for each character, ensuring to replace the shortest recorded length when a new longer substring is found.

  • Updating Data and Storage Logic: For each character processing cycle, update the data array, ensuring that you're only updating the length if a freshly calculated currentLength exceeds one of the previously stored lengths for that character.

  • Result Calculation: Finally, determine the longest substring seen three times by calculating the maximum of the smallest values stored in data.

This logic manages to efficiently track and update the longest lengths of recurring substrings in a single pass, using straightforward array manipulation and helper functions to maintain clarity and performance. This approach is particularly effective for strings where character distributions and repeating patterns are uneven, making it robust for a variety of inputs.

python
class Solution:
    def maximumSubstringLength(self, input_str: str) -> int:
        max_length = 0
        best_length = -1
        last_char = ""
        record_lengths = [[-1, -1, -1] for _ in range(26)]
        for char in input_str:
            if char == last_char:
                max_length += 1
            else:
                max_length = 1
                last_char = char

            # Update the list with new max_length if it's greater than one of the three stored lengths
            min_val = min(record_lengths[ord(char) - ord("a")])
            if max_length > min_val:
                char_index = ord(char) - ord("a")
                record_lengths[char_index][record_lengths[char_index].index(min_val)] = max_length

        # Calculate the greatest minimum value among all characters
        for i in range(26):
            best_length = max(best_length, min(record_lengths[i]))

        return best_length

The provided Python code defines a function maximumSubstringLength within a Solution class that computes the length of the longest substring which appears at least three times consecutively for any character within the given string input_str. The function uses a strategy that involves computing repeated character sequences and updating and assessing minimum values conditionally. Here’s a breakdown of the approach:

  • Initialize tracking variables: max_length to track the current maximum length of a repeating character, best_length to store the result, and last_char to remember the last visited character.
  • record_lengths is a list of lists, where each sublist corresponds to a character in the alphabet recording the top three lengths of sequences that character has repeated consecutively.
  • For each character in input_str:
    • Check if it's the same as the last character to either increment max_length or reset it.
    • Update record_lengths for that character if max_length exceeds one of its recorded lengths.
  • Complete by determining the maximum minimal value from record_lengths which represents the longest special substring length that appears at least three times.

This method optimizes the checking and recording processes for characters, ensuring efficient updating of sequence lengths using comparisons and list operations.

Comments

No comments yet.