Number of Nodes in the Sub-Tree With the Same Label

Updated on 11 July, 2025
Number of Nodes in the Sub-Tree With the Same Label header image

Problem Statement

In this problem, we are provided with a tree structure, which is a specific kind of connected, undirected graph without any cycles. The tree is made up of n nodes that are numbered from 0 to n - 1. The structure of the tree is defined by n - 1 edges, each connecting two nodes ai and bi. The node 0 serves as the root of this tree. Every node in the tree is assigned a lower-case character label, as specified in the given string labels, where the label for the node numbered i is the ith character in the string labels.

The task is to compute the size of the subtree for each node with all nodes having the same label as the node itself. The subtree of a node includes the node and all of its descendant nodes. The output should be an array of size n, where each element i in the array represents the number of nodes in the subtree rooted at node i that share the same label as node i.

Examples

Example 1

Input:

n = 7, edges = [[0,1],[0,2],[1,4],[1,5],[2,3],[2,6]], labels = "abaedcd"

Output:

[2,1,1,1,1,1,1]

Explanation:

Node 0 has label 'a' and its sub-tree has node 2 with label 'a' as well, thus the answer is 2. Notice that any node is part of its sub-tree.
Node 1 has a label 'b'. The sub-tree of node 1 contains nodes 1,4 and 5, as nodes 4 and 5 have different labels than node 1, the answer is just 1 (the node itself).

Example 2

Input:

n = 4, edges = [[0,1],[1,2],[0,3]], labels = "bbbb"

Output:

[4,2,1,1]

Explanation:

The sub-tree of node 2 contains only node 2, so the answer is 1.
The sub-tree of node 3 contains only node 3, so the answer is 1.
The sub-tree of node 1 contains nodes 1 and 2, both have label 'b', thus the answer is 2.
The sub-tree of node 0 contains nodes 0, 1, 2 and 3, all with label 'b', thus the answer is 4.

Example 3

Input:

n = 5, edges = [[0,1],[0,2],[1,3],[0,4]], labels = "aabab"

Output:

[3,2,1,1,1]

Constraints

  • 1 <= n <= 105
  • edges.length == n - 1
  • edges[i].length == 2
  • 0 <= ai, bi < n
  • ai != bi
  • labels.length == n
  • labels is consisting of only of lowercase English letters.

Approach and Intuition

Based on the problem and examples provided, we can derive an approach to solving this task. Our target is to compute the occurrence of labels for each node in its respective subtree. Let's break down the approach:

  1. Tree Initialization and Input Parsing:

    • Parse nodes and edges to construct a tree structure. This can be achieved using an adjacency list representation where each node points to its connecting nodes.
    • Initialize an array result of size n to hold the count of labels for each node's subtree.
  2. Depth-First Search (DFS):

    • Use a DFS approach starting from the root node (node 0). For each node:
      • Initialize a frequency counter for labels.
      • Traverse to its children recursively.
      • While returning (postorder traversal), aggregate the label counts of child nodes to the parent node.
      • Update the result array for each node with the count of its label in its subtree.
  3. Counting and Aggregating Labels:

    • During DFS, maintain a frequency count of labels up to the current node. Each time a node is visited, its label should be counted.
    • Any count of labels gathered from the children should be added to the parent node's corresponding label count.
  4. Updating Result on Each Visit:

    • After visiting all children of a node and before the DFS unwinds back to the parent node, record the frequency of the current node's label into the result for that node.
  5. Handling Edge Cases:

    • Nodes with no children will simply have a count of 1 for their label, as they are their own subtree.
    • Verify that the tree is valid and the input constraints are respected (like tree connectivity and label characters being lowercase).

This approach efficiently computes the label occurrences using tree traversals and can handle the input constraints given, particularly the larger values for n. It leverages the natural hierarchy of trees to aggregate and propagate label counts from leaf nodes up through the tree to the root.

Solutions

  • C++
cpp
class Solution {
public:
    vector<int> computeSubtreeLabels(int nodeCount, vector<vector<int>>& connections, string nodeLabels) {
        unordered_map<int, unordered_set<int>> graph;
        for (auto& connection : connections) {
            graph[connection[0]].insert(connection[1]);
            graph[connection[1]].insert(connection[0]);
        }
    
        vector<vector<int>> labelCounts(nodeCount, vector<int>(26));
        queue<int> nodesQueue;
    
        for (int node = 0; node < nodeCount; ++node) {
            labelCounts[node][nodeLabels[node] - 'a'] = 1;
            if (node != 0 && graph[node].size() == 1) {
                nodesQueue.push(node);
            }
        }
    
        while (!nodesQueue.empty()) {
            int current = nodesQueue.front();
            nodesQueue.pop();
    
            int parentNode = *graph[current].begin();
            graph[parentNode].erase(current);
    
            for (int i = 0; i < 26; ++i) {
                labelCounts[parentNode][i] += labelCounts[current][i];
            }
    
            if (parentNode != 0 && graph[parentNode].size() == 1) {
                nodesQueue.push(parentNode);
            }
        }
    
        vector<int> results(nodeCount);
        for (int node = 0; node < nodeCount; ++node) {
            results[node] = labelCounts[node][nodeLabels[node] - 'a'];
        }
    
        return results;
    }
};

This solution implements a function computeSubtreeLabels in the C++ programming language to count labels within the subtrees of a given tree graph and return those counts. Each node in the tree corresponds to a character from a node label string, and the goal is to count how many times the character for each node appears in its subtree.

The two main inputs to the function are:

  • nodeCount representing the number of nodes.
  • connections which is a vector of vector representing the edges between nodes.
  • nodeLabels a string where each character represents the label of a node.

The solution involves several steps:

  1. A graph is constructed using an unordered_map, where each node points to its neighboring nodes, represented as an unordered_set.
  2. A labelCounts vector is initialized to store the counts of each label (from 'a' to 'z', hence the inner vector has size 26) for every node.
  3. A queue (nodesQueue) is initialized to process nodes in a level-order fashion starting from the leaf nodes, moving towards the root (node labeled with index 0).
  4. Leaf nodes (Except the root node) are added to the queue to start the process.
  5. While processing each node, the function aggregates the label counts from the node to its parent. After a node is processed, it is removed from its parent's adjacency list to prevent re-processing.
  6. This aggregation continues until all nodes are processed. At the end of this method, each parent node contains the summed counts of labels in its subtree including itself.
  7. Finally, for each node, the count of its own label within its subtree is extracted and added to the result list, which is returned.

In summation, the function computes how many times the label of each node appears in its subtree by leveraging BFS, graph adjacency representation, and careful tallying of character counts, finally aggregating these counts into a result array that is returned. The solution optimally handles the tree structure through decomposition and accumulation of counts.

  • Java
java
class Solution {
    public int[] subtreeCount(int nodes, int[][] connections, String characters) {
        Map<Integer, Set<Integer>> graph = new HashMap<>();
        for (int[] edge : connections) {
            int node1 = edge[0], node2 = edge[1];
            graph.computeIfAbsent(node1, x -> new HashSet<>()).add(node2);
            graph.computeIfAbsent(node2, x -> new HashSet<>()).add(node1);
        }
    
        int[][] labelCounts = new int[nodes][26];
        Queue<Integer> processQueue = new LinkedList<>();
    
        for (int index = 0; index < nodes; index++) {
            labelCounts[index][characters.charAt(index) - 'a'] = 1;
            if (index != 0 && graph.get(index).size() == 1) {
                processQueue.offer(index);
            }
        }
    
        while (!processQueue.isEmpty()) {
            int currentNode = processQueue.poll();
            int parentNode = graph.get(currentNode).iterator().next();
    
            graph.get(parentNode).remove(currentNode);
    
            for (int j = 0; j < 26; j++) {
                labelCounts[parentNode][j] += labelCounts[currentNode][j];
            }
    
            if (parentNode != 0 && graph.get(parentNode).size() == 1) {
                processQueue.offer(parentNode);
            }
        }
    
        int[] result = new int[nodes];
        for (int i = 0; i < nodes; i++) {
            result[i] = labelCounts[i][characters.charAt(i) - 'a'];
        }
        return result;
    }
}

This Java solution addresses the problem of counting the number of nodes in each sub-tree that have the same label. Follow this detailed breakdown for understanding and implementing the code:

  • Initialize a graph using a HashMap to represent the connections between nodes.

  • Populate this graph from the given connections array. Each node is mapped to a set of its connected nodes, ensuring bidirectional adjacency.

  • Use an integer matrix labelCounts to keep track of the frequency of each label (from 'a' to 'z') in the sub-trees of each node.

  • Initialize this matrix such that each node initially counts only its own label (indexed by converting the character to its respective position in the alphabet).

  • Employ a Queue to manage nodes in a level-order fashion starting primarily from the leaf nodes (nodes that, apart from the root node, connect to only one other node).

  • Process each node from the queue:

    • Remove the node from its parent's set of children in the graph.
    • Update the parent node's label count by adding the label counts of the current node.
    • If the parent node becomes a leaf node, add it to the queue for subsequent processing.
  • After processing all nodes, prepare the result array where each entry is derived from labelCounts for the corresponding node's label.

  • Return the result array where each element represents the count of nodes in the respective node's sub-tree having the same label as the node itself.

This solution effectively utilizes graph representation, BFS, and careful counting using an adjacency matrix for label frequencies. Ensure understanding each part of the process, from graph construction to iterative sub-tree processing, for successful implementation.

Comments

No comments yet.