Count Substrings Without Repeating Character

Updated on 21 May, 2025
Count Substrings Without Repeating Character header image

Problem Statement

In this problem, we are provided with a string s, which is composed solely of lowercase English alphabets. We need to calculate the number of special substrings contained within this string. A substring is deemed special if it is composed of unique characters without any repetitions. Our objective is to identify all such substrings, count them, and return the total count. Understanding what constitutes a substring is crucial; it is a sequence of characters derived from the string such that it is a contiguous block within the string.

Examples

Example 1

Input:

s = "abcd"

Output:

10

Explanation:

Since each character occurs once, every substring is a special substring. We have 4 substrings of length one, 3 of length two, 2 of length three, and 1 substring of length four. So overall there are 4 + 3 + 2 + 1 = 10 special substrings.

Example 2

Input:

s = "ooo"

Output:

3

Explanation:

Any substring with a length of at least two contains a repeating character. So we have to count the number of substrings of length one, which is 3.

Example 3

Input:

s = "abab"

Output:

7

Explanation:

Special substrings are as follows (sorted by their start positions):
Special substrings of length 1: "a", "b", "a", "b"
Special substrings of length 2: "ab", "ba", "ab"
And it can be shown that there are no special substrings with a length of at least three. So the answer would be 4 + 3 = 7.

Constraints

  • 1 <= s.length <= 105
  • s consists of lowercase English letters

Approach and Intuition

The key challenge is to find substrings that do not contain any repeating characters and then to keep a count of all such substrings. Here’s a step-by-step breakdown of how we might approach this:

  1. Understanding Substring Properties: A substring is special if all characters within it are unique. We begin by examining the smallest substrings (single characters) which are always special.

  2. Counting Single-Character Substrings: Given that every single character is unique in itself, there are as many single-character special substrings as there are characters in the string s.

  3. Expanding the Search to Larger Substrings: To manage this efficiently, we can use a sliding window approach. Start with a window covering the initial part of the string and expand the window until a repeating character is encountered.

  4. Tracking Repeated Characters: Use data structures like hashmaps or sets to keep track of the characters in the current window and detect repetitions. When a repeat is detected, adjust the window to exclude the repeat and continue counting.

  5. Counting Non-Repeating Substrings: Expand the window from each character in the string unless a repeat is found. For each initial position, the window can vary in size from 1 up to the length of the substring that can be formed without repeating characters.

  6. Optimization Consideration: Directly calculating the substring count using brute force can be highly inefficient, especially for large strings. Instead, calculating potential substrings via the sliding window ensures that you only examine feasible substrings, saving computational time.

Example Scenario from Example 3:

When we analyze "abab", starting with the first character:

  • Begin with "a" — it’s valid.
  • Expand to "ab" — still valid.
  • Try "aba" — not valid as 'a' repeats. Move the window to start from the next character. Continue this until all characters are examined as starting points.

In this illustration, the effective handling of the sliding window in conjunction with set operations (for tracking characters within the current window) would allow for efficient detection and counting of all special substrings.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
public:
    int countUniqueSubstrings(string str) {
        int count = 0;
        int left = 0;
        int letterFrequency[26] = {0};
        for (int right = 0; right < str.size(); right++) {
            letterFrequency[str[right] - 'a']++;

            while (letterFrequency[str[right] - 'a'] > 1) {
                letterFrequency[str[left] - 'a']--;
                left++;
            }

            count += (right - left + 1);
        }

        return count;
    }
};

In the provided C++ solution, the goal is to count all substrings of a given string that contain only unique characters. The algorithm makes use of the sliding window technique paired with a frequency array to efficiently solve the problem.

Here's a breakdown of how the solution works:

  • Initialize count to zero. This variable holds the total count of substrings without repeating characters.
  • Use two pointers, left and right, to represent the current window of characters considered. Initialize left to 0.
  • Create an array letterFrequency of size 26 (to represent each letter of the alphabet). Initialize all entries to zero.
  • Traverse the input string using the right pointer:
    • Increase the count of the current character in the letterFrequency array.
    • If the current character's frequency exceeds one, which means a repetition, adjust the left pointer:
      • Move the left pointer to the right to shrink the window until no characters in the current window repeat.
      • Decrease the frequency of the character being skipped.
    • Calculate the number of substrings ending at the current right index which do not contain repeating characters. This is given by (right - left + 1) and add it to count.
  • Continue this until the right pointer has processed all characters in the string.
  • Return the count.

The solution effectively counts valid substrings in O(n) time using a constant space of O(1), as the frequency array's size remains fixed, regardless of the input size. This approach is optimal for cases where the input string consists of a fixed and limited character set.

java
class Solution {

    public int countUniqueSubstrings(String inputStr) {
        int totalCount = 0;

        int leftPointer = 0;
        int[] letterFrequency = new int[26];
        for (int rightPointer = 0; rightPointer < inputStr.length(); rightPointer++) {
            letterFrequency[inputStr.charAt(rightPointer) - 'a']++;

            while (letterFrequency[inputStr.charAt(rightPointer) - 'a'] > 1) {
                letterFrequency[inputStr.charAt(leftPointer) - 'a']--;
                leftPointer++;
            }

            totalCount += (rightPointer - leftPointer + 1);
        }

        return totalCount;
    }
}

This Java solution efficiently counts the number of substrings without repeating characters in a given string using a sliding window technique. Implement the method countUniqueSubstrings which takes a single string argument, inputStr, and returns the total number of unique substrings.

  • Initialize totalCount to zero to hold the final count of substrings.
  • Use two pointers, leftPointer and rightPointer, to represent the current window of characters being considered.
  • Use an array letterFrequency to keep track of the frequency of each character within the current window.
  • Iterate through the string using rightPointer. For each character:
    • Increase the frequency of the character in letterFrequency.
    • If a character's frequency exceeds 1 (indicating a repeat), increment the leftPointer to shrink the window from the left until all characters within the window are unique again.
    • Update totalCount by adding the size of the current valid window (difference between rightPointer and leftPointer plus 1).
  • Return totalCount at the end, which represents the total number of substrings without repeating characters.

This sliding window algorithm ensures that each substring is checked for repeating characters in optimal time, making the approach efficient for large inputs.

python
class Solution:
    def countUniqueSubstrings(self, input_string: str) -> int:
        count_of_substrings = 0

        left_index = 0
        char_frequency = [0] * 26
        for right_index in range(len(input_string)):
            char_frequency[ord(input_string[right_index]) - ord("a")] += 1

            while char_frequency[ord(input_string[right_index]) - ord("a")] > 1:
                char_frequency[ord(input_string[left_index]) - ord("a")] -= 1
                left_index += 1

            count_of_substrings += right_index - left_index + 1

        return count_of_substrings

This summary details a solution for counting unique substrings without repeating characters within a given string, using Python.

  • Define a class Solution that contains the method countUniqueSubstrings, which takes a single string as its argument.
  • Initialize count_of_substrings to zero, to store the counts of unique substrings.
  • Set left_index to zero, serving as the starting point of the substring being evaluated.
  • Prepare an array char_frequency of size 26 (for the English alphabet), initialized with zeros, to count the frequency of each character within the substring.
  • Use a loop to iterate through each character of the input_string, using right_index as the loop variable.
  • Increment the frequency of the currently considered character based on its ASCII value, adjusted for 'a' (ord("a") equals 97).
  • If the frequency of the current character (pointed by right_index) exceeds one, iteratively decrement the frequency of the character at left_index and increment left_index. This process removes characters from the beginning of the substring until all characters are unique within the current substring.
  • After ensuring all characters between left_index and right_index are unique, add the length of the current valid substring (which is right_index - left_index + 1) to count_of_substrings.
  • Continue the iteration until all characters in input_string have been processed.
  • Return the total count_of_substrings.

By implementing this sliding window technique, coupled with a frequency array to track character occurrences, the method efficiently counts all substrings that contain only unique characters.

Comments

No comments yet.