Stream of Characters

Updated on 25 June, 2025
Stream of Characters header image

Problem Statement

In this problem, you are tasked with designing an algorithm that processes a continuous stream of characters, evaluating if the ending segment (suffix) of the stream matches any string from a predefined list of strings known as words. Your function should dynamically assess each new character added to the stream, determining if a recent sequence of characters corresponds to a word in words.

To implement this, you need to define the class StreamChecker with the following functionalities:

  • StreamChecker(String[] words): A constructor that initializes the class with an array of strings words.
  • boolean query(char letter): A method that accepts a new character from the stream and checks if the current concatenation of characters ends with any word in words. This method should return true if such a match exists, otherwise false.

Examples

Example 1

Input:

["StreamChecker", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query"]
[[["cd", "f", "kl"]], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"], ["l"]]

Output:

[null, false, false, false, true, false, true, false, false, false, false, false, true]

Explanation:

StreamChecker streamChecker = new StreamChecker(["cd", "f", "kl"]);
streamChecker.query("a"); // return false
streamChecker.query("b"); // return false
streamChecker.query("c"); // return false
streamChecker.query("d"); // return true, because 'cd' is in the wordlist
streamChecker.query("e"); // return false
streamChecker.query("f"); // return true, because 'f' is in the wordlist
streamChecker.query("g"); // return false
streamChecker.query("h"); // return false
streamChecker.query("i"); // return false
streamChecker.query("j"); // return false
streamChecker.query("k"); // return false
streamChecker.query("l"); // return true, because 'kl' is in the wordlist

Constraints

  • 1 <= words.length <= 2000
  • 1 <= words[i].length <= 200
  • words[i] consists of lowercase English letters
  • letter is a lowercase English letter
  • At most 4 * 10⁴ calls will be made to query

Approach and Intuition

To solve this problem efficiently, we combine a Trie data structure with an optimized stream handling strategy.

  1. Reverse Trie Construction:

    • Since we are matching suffixes of the stream, we insert each word in words in reverse order into a Trie.
    • For example, the word "abc" is inserted as 'c' → 'b' → 'a'.
  2. Tracking the Stream:

    • Maintain a buffer (such as a list or deque) of the most recent characters received from query.
    • The length of this buffer never needs to exceed the length of the longest word in words.
  3. Efficient Querying:

    • Each time a new letter is received via query, add it to the front of the buffer.
    • Starting from the first character in the buffer, attempt to traverse the Trie.
    • If a terminal node is reached during traversal, it means a word has been matched—return true.
    • If traversal breaks before reaching a terminal node or fails altogether, return false.
  4. Performance Considerations:

    • Time complexity per query is proportional to the length of the longest word.
    • Since we only store the latest characters up to the maximum word length, space complexity remains bounded.

This strategy guarantees that even with up to 40,000 queries and 2,000 words, the program remains efficient and responsive.

Solutions

  • Java
  • Python
java
class TrieNode {
    Map<Character, TrieNode> map = new HashMap<>();
    boolean endOfWord = false;
}

class StreamChecker {
    TrieNode root = new TrieNode();
    Deque<Character> history = new LinkedList<>();

    public StreamChecker(String[] words) {
        for (String word : words) {
            TrieNode node = root;
            String reversed = new StringBuilder(word).reverse().toString();
            for (char character : reversed.toCharArray()) {
                if (!node.map.containsKey(character)) {
                    node.map.put(character, new TrieNode());
                }
                node = node.map.get(character);
            }
            node.endOfWord = true;
        }
    }
    
    public boolean query(char letter) {
        history.addFirst(letter);
        
        TrieNode node = root;
        for (char character : history) {
            if (node.endOfWord) {
                return true;
            }
            if (!node.map.containsKey(character)) {
                return false;
            }
            node = node.map.get(character);
        }
        return node.endOfWord;
    }
}

The goal of the provided Java solution is to determine if any substring of the received characters matches with any word out of the provided list using a Trie and a stream processing approach.

  • A TrieNode class is defined where:

    • map holds the children nodes.
    • endOfWord indicates the end of a word.
  • The StreamChecker class is responsible for:

    1. Initializing the Trie based on the reversed words to simplify suffix checking. This is done in the StreamChecker constructor where each input word is reversed and inserted into the Trie.
    2. Checking the incoming characters against the Trie dynamically. This operation is performed in the query method by adding the new character to the front of a history deque which represents the stream of characters received so far.
  • The query method processes each incoming letter by:

    1. Adding each character to a history.
    2. Iteratively checking each character from the deque against the Trie.
    3. Returning true if a substring of the characters corresponds to a word end in the Trie.

This setup allows efficient checking of substrings against a predefined set of words by leveraging the Trie’s compact structure and quick lookup capabilities. The use of history to keep recent characters and the Trie construction for reversed words are critical for ensuring that the stream processing is both effective and efficient.

python
class WordStreamChecker:

    def __init__(self, words: List[str]):
        self.root = {}
        self.incomingChars = deque([])

        for word in set(words):
            currentNode = self.root     
            for character in reversed(word):
                if character not in currentNode:
                    currentNode[character] = {}
                currentNode = currentNode[character]
            currentNode['`'] = word
        
    def checkQuery(self, character: str) -> bool:
        self.incomingChars.appendleft(character)
        
        currentNode = self.root
        for char in self.incomingChars:
            if '`' in currentNode:
                return True
            if char not in currentNode:
                return False
            currentNode = currentNode[char]
        return '$' in currentNode

The given Python script defines a class WordStreamChecker to manage a stream of characters and check for the presence of specific words within this stream. Here's a concise explanation of how the implementation accomplishes this:

  • Class Initialization (__init__ method):

    • Initializes a trie data structure to efficiently store and access words.
    • The words argument, a list of strings, is converted into a unique set to avoid duplicates and then stored in a trie in reverse order. This reverse storage is crucial as the stream of characters will be evaluated from most recent to older entries.
    • A special marker '`' is added at the end node of each word in the trie to signify the completion of a word.
  • Character Stream Handling (checkQuery method):

    • Receives a character and adds it to a deque which maintains the current stream of characters.
    • It then checks, starting from the most recent character added, whether a sequence of these characters exists in the trie and forms a word that was originally input.

This setup allows the caller to continuously feed characters into the checkQuery method and receive immediate feedback on whether the recent sequence of characters has formed a complete word that exists in the initial list. It effectively handles multiple checks efficiently by leveraging the trie's properties and the deque's fast appends and pops.

Comments

No comments yet.