Unique Substrings With Equal Digit Frequency

Updated on 14 July, 2025
Unique Substrings With Equal Digit Frequency header image

Problem Statement

The task is to determine the number of unique substrings of a given digit string s such that within each substring, every distinct digit appears an equal number of times. For this problem, a substring is considered unique based on its content and not on the positions of the characters in the original string (e.g., two occurrences of the substring "12" in different positions of s are considered the same). This challenge emphasizes the combinatorial and frequency analysis within substrings of a given string.

Examples

Example 1

Input:

s = "1212"

Output:

5

Explanation:

The substrings that meet the requirements are "1", "2", "12", "21", "1212".
Note that although the substring "12" appears twice, it is only counted once.

Example 2

Input:

s = "12321"

Output:

9

Explanation:

The substrings that meet the requirements are "1", "2", "3", "12", "23", "32", "21", "123", "321".

Constraints

  • 1 <= s.length <= 1000
  • s consists of digits.

Approach and Intuition

To solve the problem effectively, follow these steps:

  1. Iterate over all possible substrings of the string s. The total number of substrings of a string of length n is (n*(n+1))/2.
  2. For each substring, check if all characters (digits) occur the same number of times. You can achieve this by maintaining a frequency count of each digit in the substring.
  3. Use a data structure such as a set to store and ensure that only unique substrings are counted, avoiding duplicates caused by the same substrings occurring in multiple positions.
  4. Efficiently update the frequency counts as you expand the substring during iteration to reduce the overall computational complexity.

Given the constraints:

  • Maximum length of s is 1000, making a direct approach feasible but potentially inefficient for larger strings.
  • Since s only consists of digits, the frequency count will be limited to 10 possible digits (0 through 9), facilitating quicker checks and balances.

Through careful implementation leveraging substring frequency analyses and maintaining uniqueness with appropriate data structures, we can determine the number of qualifying unique substrings efficiently.

Solutions

  • C++
cpp
class Solution {
public:
    int countEqualDigitFrequency(const std::string& str) {
        TrieNode* rootNode = new TrieNode();  // Root of Trie
        int count = 0;
    
        // Iterate starting points of substrings
        for (int i = 0; i < str.size(); i++) {
            TrieNode* current = rootNode;
            vector<int> freq(10, 0);
            int uniqueCount = 0;
            int maxFreq = 0;
    
            // Expand ending points of substrings
            for (int j = i; j < str.size(); j++) {
                int digit = str[j] - '0';  // Extract current digit
    
                // Manage frequency of current digit
                if (freq[digit]++ == 0) {
                    uniqueCount++;
                }
    
                maxFreq =
                    max(maxFreq, freq[digit]);
    
                // Extend or create Trie node
                if (current->links[digit] == nullptr) {
                    current->links[digit] = new TrieNode();
                }
    
                // Move to the node
                current = current->links[digit];
    
                // Validate substring
                if (uniqueCount * maxFreq == j - i + 1 && !current->seen) {
                    count++;
                    current->seen = true;
                }
            }
        }
    
        delete rootNode;  // Free memory
        return count;
    }
    
private:
    class TrieNode {
    public:
        TrieNode* links[10]; // Trie links for digits 0-9
        bool seen;
    
        TrieNode() : seen(false) {
            fill(begin(links), end(links), nullptr);
        }
    };
};

This C++ solution utilizes a trie to count unique substrings where all digits appear with the same frequency in the string. Below outlines the approach taken:

  1. Define a TrieNode class with links for digits 0-9 and a boolean flag seen which indicates if a node has been processed.
  2. In the countEqualDigitFrequency function:
    • A new TrieNode rootNode is initialized, starting as the root of the trie.
    • Initialize a counter count for matching substrings.
    • Utilize two nested loops to explore all possible substrings of the input string str.
    • For each starting index i, reset the frequency count of digits (0-9) and define variables for tracking unique digit count and maximum frequency observed.
    • For each ending index j, extract the digit from position j, update its frequency, and check if the unique digit count multiplied by maximum frequency equals the length of the current substring—suggesting all digits occur with equal frequency.
    • If the substring meets criteria and hasn't been marked seen in the trie, increment count and mark it as seen.
    • If a trie link for the current digit doesn't exist, create a new TrieNode and move to this node.
    • Continue until all substrings are processed.
    • Memory cleanup for rootNode is ensured after processing all substrings.

This method effectively counts substrings using a trie structure to avoid recounting substrings with the same properties, leveraging memory efficiency and improving runtime by avoiding redundancy.

  • Java
java
class Solution {
    public int calculateUniqueSubstring(String str) {
        TrieNode rootNode = new TrieNode(); // Initialize root node for Trie
        int totalSubstrings = 0;
    
        // Move through each possible starting point of substrings in string
        for (int i = 0; i < str.length(); i++) {
            TrieNode currentNode = rootNode;
            int[] freqCount = new int[10];
            int uniqueCount = 0;
            int maxFreq = 0;
    
            // Exploring substring that begins at index i
            for (int j = i; j < str.length(); j++) {
                int digit = str.charAt(j) - '0';
    
                // Updating frequency of current digit
                if (freqCount[digit]++ == 0) uniqueCount++;
                maxFreq = Math.max(maxFreq, freqCount[digit]);
    
                // Create new child node if not already existing
                if (currentNode.childNodes[digit] == null) {
                    currentNode.childNodes[digit] = new TrieNode();
                }
                currentNode = currentNode.childNodes[digit]; // Navigate to child node
    
                // Check conditions for a valid substring
                if (uniqueCount * maxFreq == j - i + 1 && !currentNode.seen) {
                    totalSubstrings++;
                    currentNode.seen = true; // Mark the node indicating substring has been counted
                }
            }
        }
        return totalSubstrings;
    }
    
    class TrieNode {
        TrieNode[] childNodes = new TrieNode[10]; // Nodes for each digit (0-9)
        boolean seen; // Flag to check if a given node is counted
    }
}

The provided Java solution aims to identify and count all unique substrings within a given string where each digit can be hashed to the frequency of its occurrence equally. Implement a Trie structure to optimize checking the uniqueness of the substrings processed.

Here’s a breakdown of what happens in the solution:

  • Initialize a root node for the Trie structure which is crucial for storing and navigating substring combinations.
  • Loop through each starting point in the input string to explore potential substrings.
  • For each starting point:
    • Initialize a TrieNode pointing to the root and an array to count the frequency of each digit.
    • Define variables to keep track of how many unique digits and the maximum frequency of any digit in the current processing substring.
    • For each character in the substring extending from the current starting point:
      • Update the frequency of the digit.
      • Manage Trie navigation, dynamically creating nodes if they don't exist, moving deeper with each character processed.
      • Verify if all unique digits have the same frequency, ensuring the product of unique count and maximum frequency equals the length of the current substring, and the substring hasn't been counted before.
  • If all conditions are satisfied, increment the count of unique substrings and mark the node as seen to prevent recounting.
  • Finally, return the count of unique valid substrings for the entire input string.

This method efficiently processes the input by using the Trie to both keep track of substring processed and quickly verify new substrings against previously encountered ones, enabling an optimized approach to the problem of counting unique substrings with uniform digit frequencies.

  • Python
python
class Solution:
    def subStringScore(self, s: str) -> int:
        trie_root = self.Node()  # Initializing the root of Trie
        count_valids = 0  # Stores the count of valid substrings
    
        # Iterate over all starting points in the string 's'
        for start_index in range(len(s)):
            current_trie_node = trie_root
            digit_counts = [0] * 10  # Holds frequency of each digit
            count_unique_digits = 0
            highest_digit_freq = 0
    
            # Explore all possible ending points extending from 'start_index'
            for end_index in range(start_index, len(s)):
                # Convert character to digit
                digit = int(s[end_index])
    
                # Update frequency of this digit
                if digit_counts[digit] == 0:
                    count_unique_digits += 1
                digit_counts[digit] += 1
                highest_digit_freq = max(highest_digit_freq, digit_counts[digit])
    
                # Create new TrieNode if it does not exist
                if not current_trie_node.next[digit]:
                    current_trie_node.next[digit] = self.Node()
                # Traverse to the next Node in Trie
                current_trie_node = current_trie_node.next[digit]
    
                # Check the validity of the current substring
                if (
                    count_unique_digits * highest_digit_freq == end_index - start_index + 1
                    and not current_trie_node.visited
                ):
                    # Increment the valid substring count
                    count_valids += 1
                    # Mark this substring as visited in trie
                    current_trie_node.visited = True
    
        return count_valids
    
    class Node:
        def __init__(self):
            self.next = [None] * 10  # Child nodes for digits 0-9
            self.visited = False  # Whether this node represents a visited substring

Below, find the solution explanation for the problem of identifying unique substrings within a string such that each substring contains digits with equal frequency:

  1. Initiate a class Solution with a method subStringScore which takes a string s and returns the count of valid substrings.

  2. Inside subStringScore, initialize a Trie data structure using a nested class Node, which includes:

    • An array next of size 10 for child nodes representing digits 0-9.
    • A boolean visited to check if this substring has been processed.
  3. Initialize count_valids to keep track of the number of valid substrings.

  4. Loop over each possible starting index of the substring in string s.

  5. For each starting index:

    • Initialize current_trie_node to the root of the Trie.
    • Keep an array digit_counts of size 10 to maintain the frequency of each digit in the current substring.
    • Initialize count_unique_digits to count the distinct digits and highest_digit_freq to store the maximum frequency of any digit in the current substring.
  6. Expand the current substring by iterating over every possible ending index from the current start. Convert the current character to a digit and update its frequency in digit_counts.

  7. Update Trie:

    • If there isn’t a Trie node for the digit, create one.
    • Move to this node for further processing.
  8. Validate the substring:

    • Check if the product of the number of unique digits and the highest digit frequency matches the length of the substring.
    • Ensure this substring hasn't been marked as visited in the Trie.
  9. If the substring is valid and not visited, increment the count_valids and mark it as visited in the Trie to prevent duplicate counting.

  10. Return count_valids as the final count of unique valid substrings.

This solution leverages the Trie structure to avoid redundant processing and ensure that each unique valid substring is only counted once. The combination of iterating start and end indices with Trie navigation provides a clean method to determine substrings meeting the specified conditions efficiently.

Comments

No comments yet.