Design Compressed String Iterator

Updated on 27 May, 2025
Design Compressed String Iterator header image

Problem Statement

This article focuses on designing a data structure called StringIterator for navigating through a specially formatted compressed string. In this format, each character of a string is followed by a positive integer indicating the number of times the character appears consecutively in the uncompressed version of the string. The StringIterator class provides two primary methods:

  1. next(): This method returns the next character from the uncompressed version of the string. If all characters are exhausted, it returns a white space.
  2. hasNext(): This function checks whether there are more characters to be returned (i.e., the string has not been completely uncompressed yet) and returns a boolean value accordingly.

These functionalities are essential for processing and iterating over compressed strings without actually decompressing them entirely, optimizing both time and space complexity.

Examples

Example 1

Input:

["StringIterator", "next", "next", "next", "next", "next", "next", "hasNext", "next", "hasNext"]
[["L1e2t1C1o1d1e1"], [], [], [], [], [], [], [], [], []]

Output:

[null, "L", "e", "e", "t", "C", "o", true, "d", true]

Explanation:

StringIterator stringIterator = new StringIterator("L1e2t1C1o1d1e1");
stringIterator.next(); // return "L"
stringIterator.next(); // return "e"
stringIterator.next(); // return "e"
stringIterator.next(); // return "t"
stringIterator.next(); // return "C"
stringIterator.next(); // return "o"
stringIterator.hasNext(); // return True
stringIterator.next(); // return "d"
stringIterator.hasNext(); // return True

Constraints

  • 1 <= compressedString.length <= 1000
  • compressedString consists of lower-case an upper-case English letters and digits.
  • The number of a single character repetitions in compressedString is in the range [1, 10^9]
  • At most 100 calls will be made to next and hasNext.

Approach and Intuition

The StringIterator class must efficiently handle potentially long sequences of repetitive characters as indicated by large count values (up to 10^9) without physically expanding these sequences. Here's the proposed approach and intuition behind the functioning of StringIterator based on examples and constraints provided:

  1. Initialization and Parsing:

    • During the initialization of the iterator with a compressed string, the constructor needs to parse this string. The string consists of characters followed by their respective counts. Parsing can be efficiently managed using a loop that extracts characters and their corresponding repetition counts into a structured format, such as pairs in a list.
  2. next() Implementation:

    • This function should maintain a counter for how many times the current character needs to be returned before moving on to the next character.
    • If the counter hits zero, indicating all instances of the current character have been iterated over, the function should advance to the next character in the sequence and reset the counter with its count.
    • If there's no next character (all characters are exhausted), it should return a white space.
  3. hasNext() Implementation:

    • This method simply checks whether there are more characters to output. This could be as simple as checking if there's any character left in the sequence or if the current character's counter has not reached zero.

The key here is efficiently managing the transition between different characters and their counts without expanding the entire string, leveraging the structured parsing done during initialization. This will enable the next() and hasNext() methods to operate in constant time, making the iterator efficient even for very large strings.

Solutions

  • Java
java
public class CharSequenceIterator {
    String sequence;
    int index = 0, count = 0;
    char currentChar = ' ';
    public CharSequenceIterator(String str) {
        sequence = str;
    }
    public char next() {
        if (!hasNext())
            return ' ';
        if (count == 0) {
            currentChar = sequence.charAt(index++);
            while (index < sequence.length() && Character.isDigit(sequence.charAt(index))) {
                count = count * 10 + sequence.charAt(index++) - '0';
            }
        }
        count--;
        return currentChar;
    }
    public boolean hasNext() {
        return index != sequence.length() || count != 0;
    }
}

The solution presents a Java program for a compressed string iterator, which operates on a string where characters followed by one or more digits represent the character repeated the number of times specified by the digits.

  • Initialize the CharSequenceIterator with a compressed string. This string is stored in a member variable sequence.
  • Define member variables index, count, and currentChar. index tracks the current position in the string, count keeps track of how many times the current character needs to be returned, and currentChar stores the current character to return.
  • The next() method provides the next character in the sequence:
    • First, check if there are any characters left to return using hasNext(). If none are left, return a space character as a default.
    • If count is zero, the next character and its count need to be parsed. Increment index to start reading digits and calculate the count using a loop that multiplies the current count by 10 and adds the digit value, repeating until no digits follow the character.
    • Decrease count by one and return the currentChar.
  • The hasNext() method checks if there are more elements available in the string:
    • Return true if either index hasn't reached the end of the string or count is not zero, indicating that there are still characters or repetitions left to process.

This iterator effectively handles the traversal and expansion of compressed sequences in a string, using integer calculations and character operations to unpack and serve each character on demand.

Comments

No comments yet.