Find K-Length Substrings With No Repeated Characters

Updated on 26 May, 2025
Find K-Length Substrings With No Repeated Characters header image

Problem Statement

Given a string s and an integer k, the problem is to determine how many distinct substrings of length k can be formed from s where each substring has all unique characters. A substring is defined as a contiguous sequence of characters within the string. Each calculated substring must exhibit unique usage of characters without any repetition.

Examples

Example 1

Input:

s = "havefunonvultr", k = 5

Output:

6

Explanation:

There are 6 substrings they are: 'havef','avefu','vefun','efuno','etcod','tcode'.

Example 2

Input:

s = "home", k = 5

Output:

0

Explanation:

Notice k can be larger than the length of s. In this case, it is not possible to find any substring.

Constraints

  • 1 <= s.length <= 104
  • s consists of lowercase English letters.
  • 1 <= k <= 104

Approach and Intuition

  1. Understanding the constraints:

    • The length of string s can be as small as 1 or as large as 10,000.
    • The integer k also ranges between 1 to 10,000.
    • It is crucial to note that s only contains lowercase English letters.
  2. Explanation based on given examples:

    • In the first given input s = "havefunonvultr", k = 5, the task is to identify all possible substrings of length 5 with all unique characters. The identified substrings such as 'havef', 'avefu', 'vefun', etc., are unique sequences of 5 characters.
    • In the second example input s = "home", k = 5, since k is greater than the length of s, it's impossible to have any substrings of length k. Therefore, the output is 0.
  3. Steps to approach:

    • Iterate through the string s while maintaining a sliding window of size k.
    • Check if characters within the current window are unique. This can be managed using a set or dictionary to track characters.
    • If the window represents a substring of unique characters, increment a counter.
    • Continue this until the end of string s.
    • Return the counter as the result.

The approach leverages a straightforward sliding window mechanism to ensure all potential substrings of the specified length are evaluated for uniqueness of characters. Given that s is comprised only of lowercase letters, utilization of a set for constant time checks substantially optimizes the process.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
public:
    int countKLengthSubstrings(string s, int k) {
        if (k > 26) return 0;

        int count = 0;
        int n = s.length();

        int start = 0, end = 0;
        int charCount[26] = {0};

        while (end < n) {
            charCount[s[end] - 'a']++;

            while (charCount[s[end] - 'a'] > 1) {
                charCount[s[start] - 'a']--;
                start++;
            }

            if (end - start + 1 == k) {
                count++;
                charCount[s[start] - 'a']--;
                start++;
            }

            end++;
        }

        return count;
    }
};

This solution provides an efficient way to count all possible k-length substrings in a given string s that do not contain any repeated characters using C++.

  • Initialize a count to zero which will store the result, representing the number of k-length substrings with all unique characters.
  • Declare an array charCount to maintain the frequency of each character within a current window of the string.
  • Utilize two pointers, start and end, to define the current window in the string. The end pointer expands the window and the start pointer contracts it when a repeated character is found.
  • Move the end pointer from the beginning to the end of the string. Increase the count of the current character pointed by end.
  • If a character gets repeated (frequency becomes more than 1), contract the window from the start until there are no repeated characters in the window.
  • If the window size matches k (the desired substring length), increment the count of valid substrings. Then, contract the window from the start to consider new possible windows.
  • Finally, return count, which contains the total number of k-length substrings consisting of unique characters.

This approach ensures that each possible substring of size k is checked only once, maintaining a running record of character counts within the current window, leading to an optimal solution with a time complexity primarily dependent on the length of the string s.

java
class Solution {
    public int countKLenSubstrUniqueChars(String str, int length) {
        if (length > 26) return 0;  // Since only 26 unique characters can exist

        int resultCount = 0;
        int strLength = str.length();

        // Sliding window bounds, left and right pointers
        int left = 0, right = 0;
        // Occurrences of each character
        int[] occurences = new int[26];
        
        while (right < strLength) {
            // Increment character count
            occurences[str.charAt(right) - 'a']++;
            
            // If duplicate found, move left pointer to make characters unique
            while (occurences[str.charAt(right) - 'a'] > 1) {
                occurences[str.charAt(left) - 'a']--;
                left++;
            }
            
            // Valid unique substring of size 'length'
            if (right - left + 1 == length) {
                resultCount++;
                
                // Move left to search for new substring
                occurences[str.charAt(left) - 'a']--;
                left++;
            }
            
            // Move right pointer to expand window
            right++;
        }
        
        return resultCount;
    }
}

The Java implementation provided involves a function designed to count the number of substrings of a specified length (length) that contain all unique characters within the given string (str). This solution utilizes the sliding window technique along with a frequency array to ensure each character within the current window is unique.

  • Start by checking if the desired substring length exceeds 26. Return 0 in such cases as it is impossible to have more than 26 unique characters with only the English alphabet.
  • Initialize resultCount to track the count of valid substrings and occurences, an array to monitor the frequency of each character as you move through the string.
  • Use two pointers, left and right, to manage the bounds of the sliding window. left controls the start of the substring, while right extends the substring.
  • Increment the count of the current character (given by right pointer) in the occurrences array.
  • If the count of a character exceeds one, implying a duplicate, adjust the left pointer to restore uniqueness within the sliding window.
  • Whenever the size of the window matches length, increment the resultCount because a valid substring is identified. Subsequently, adjust the left pointer and corresponding character frequency as the window shifts.
  • Continue to move the right pointer to explore further substrings until all potential substrings are examined.
  • Finally, return the resultCount, which represents the total valid substrings fitting the criteria.

This logic ensures efficient substring verification by dynamically adjusting a constrained window of characters, thereby tallying up substrings that fulfill the uniqueness requirement without redundant checks.

python
class Solution:
    def countUniqueSubstr(self, input_string: str, length: int) -> int:
        if length > 26:
            return 0
        total_count = 0
        string_length = len(input_string)
        left_pointer = right_pointer = 0
        char_count = [0] * 26
        
        def char_index(character: str) -> int:
            return ord(character) - ord('a')
        
        while right_pointer < string_length:
            char_count[char_index(input_string[right_pointer])] += 1
            
            while char_count[char_index(input_string[right_pointer])] > 1:
                char_count[char_index(input_string[left_pointer])] -= 1
                left_pointer += 1
            
            if right_pointer - left_pointer + 1 == length:
                total_count += 1
                char_count[char_index(input_string[left_pointer])] -= 1
                left_pointer += 1
            
            right_pointer += 1

        return total_count

This solution in Python utilizes the sliding window technique to solve the problem of finding all substrings of a specified length, k, that contain no repeated characters within them. Here’s how the solution is structured:

  • First, handle an edge case: if length exceeds 26, immediately return 0, as the English alphabet contains only 26 characters, making it impossible to have a longer substring without repeated characters.

  • Initialize total_count to keep track of the count of valid substrings, and left_pointer and right_pointer to manage the boundaries of the current substring window within input_string.

  • Use a list char_count of size 26 (representing each letter of the alphabet) to record the frequency of each character in the current window. This helps in quickly checking whether a letter is repeated in the window.

  • Define a helper function char_index that returns the index of a character within char_count using ASCII values.

  • As you expand right_pointer to include new characters in the window:

    • Increase the count of the character at right_pointer.
    • If this character's count exceeds 1 (indicating a repetition), contract the window from the left by incrementing left_pointer and adjusting counts, until all characters are unique again.
  • If the window size matches length, record this as a valid substring:

    • Increment total_count since the characters between left_pointer and right_pointer are unique.
    • Move left_pointer one character right to explore a new potential substring in the next loop iterations.
  • Continue expanding right_pointer to explore subsequent potential substrings until it surpasses the bounds of input_string.

  • Finally, return total_count which stores the number of valid substrings found.

This algorithm efficiently navigates the input string while maintaining an optimal check for repeated characters using the sliding window strategy, ensuring the substrings checked always meet the condition of having no repeated characters within the specified length.

Comments

No comments yet.