Data Stream as Disjoint Intervals

Updated on 16 May, 2025
Data Stream as Disjoint Intervals header image

Problem Statement

The goal is to design a data structure for a dynamic system that receives a flow of non-negative integers and can summarise the data in real time as a collection of disjoint intervals. The implementation focuses on building the SummaryRanges class with functionalities to add numbers to the stream and retrieve a summary of the stream as ordered, non-overlapping intervals.

This class provides:

  • A constructor SummaryRanges() to initialize the system without any data.
  • A method addNum(int value) to insert a new integer into the stream.
  • A method getIntervals() that returns the summary of all integers input so far in the format of disjoint intervals [starti, endi], sorted by the start of each interval.

Examples

Example 1

Input:

["SummaryRanges", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals", "addNum", "getIntervals"]
[[], [1], [], [3], [], [7], [], [2], [], [6], []]

Output:

[null, null, [[1, 1]], null, [[1, 1], [3, 3]], null, [[1, 1], [3, 3], [7, 7]], null, [[1, 3], [7, 7]], null, [[1, 3], [6, 7]]]

Explanation:

SummaryRanges summaryRanges = new SummaryRanges();
summaryRanges.addNum(1); // arr = [1]
summaryRanges.getIntervals(); // return [[1, 1]]
summaryRanges.addNum(3); // arr = [1, 3]
summaryRanges.getIntervals(); // return [[1, 1], [3, 3]]
summaryRanges.addNum(7); // arr = [1, 3, 7]
summaryRanges.getIntervals(); // return [[1, 1], [3, 3], [7, 7]]
summaryRanges.addNum(2); // arr = [1, 2, 3, 7]
summaryRanges.getIntervals(); // return [[1, 3], [7, 7]]
summaryRanges.addNum(6); // arr = [1, 2, 3, 6, 7]
summaryRanges.getIntervals(); // return [[1, 3], [6, 7]]

Constraints

  • 0 <= value <= 104
  • At most 3 * 104 calls will be made to addNum and getIntervals.
  • At most 102 calls will be made to getIntervals.

Approach and Intuition

  1. Stream Initialization

    • Through the constructor SummaryRanges(), the object is initialized. Typically, you might use a dynamic data structure such as a list or a set to handle the incoming data stream and maintain its state.
  2. Adding Numbers to the Stream

    • The addNum(int value) method involves adding a number to the data structure. The main challenge here is ensuring that the addition does not disrupt the ability to efficiently generate intervals. One efficient way to handle this could be to use a tree structure or a sorted list to maintain order and facilitate the merging of intervals.
  3. Generating and Retrieving Intervals

    • The getIntervals() method retrieves a summary of the data as disjoint intervals. After adding each number, you would need to check if this number bridges the gap between two existing intervals or extends an existing interval. If neither, it becomes an interval by itself.
    • To efficiently merge intervals, you can use an insertion position search (like binary search when using a sorted structure). If the added number fits within or extends an interval, modify the interval; otherwise, add a new interval.
    • Repeat interval checks and merges until no further consolidation is possible.
  4. Maintaining Order and Efficiency

    • By keeping the stream's data in a sorted order and judiciously merging intervals whenever possible, the retrieval of intervals remains efficient. This ensures that each operation is done in a reasonable time, considering frequent addNum calls and relatively less frequent getIntervals calls as outlined in the constraints.

In essence, the data structure should support the dynamism of a streaming system with a focus on input order, efficient data summarization into intervals, and maintaining quick access and updates. This approach would also respect the given constraints on the number of operations and allowable input ranges.

Solutions

  • C++
  • Java
cpp
class IntervalManager {
    map<int, int> ranges;

public:
    IntervalManager() {}

    void insertNumber(int val) {
        int start = val, end = val;
        auto next = ranges.upper_bound(val);
        if (next != ranges.begin()) {
            auto prev = next;
            --prev;
            if (prev->second >= val) {
                return;
            }
            if (prev->second == val - 1) {
                start = prev->first;
            }
        }
        if (next != ranges.end() && next->first == val + 1) {
            end = next->second;
            ranges.erase(next);
        }
        ranges[start] = end;
    }

    vector<vector<int>> getRanges() {
        vector<vector<int>> result;
        for (const auto& interval : ranges) {
            result.push_back({interval.first, interval.second});
        }
        return result;
    }
};

The provided C++ solution defines a class IntervalManager that maintains a set of non-overlapping intervals modified by a stream of numbers. The class employs an std::map to intelligently store the intervals, ensuring efficient operations and minimal storage overhead. Here's a breakdown of how the solution manages these intervals:

  • Constructor Initialization: Initializes an empty IntervalManager with no intervals.

  • Inserting Numbers:

    • insertNumber method takes an integer, val, and determines the potential start and end of a new interval.
    • It finds the position in the map where this number would fit (next iterator).
    • If val can merge with the interval on the left, it adjusts the start.
    • If val can extend the interval on the right, it adjusts the end and removes the redundant entry from the map.
    • The new or adjusted interval is then stored or updated in the map.
  • Retrieving Intervals:

    • getRanges method compiles the current set of intervals from the map into a vector<vector<int>> format suitable for external use.
    • Iterates through the map and pushes each interval into the resulting vector.

This structure and the associated algorithms ensure efficient management and querying of intervals, providing an accurate representation of disjoint intervals affected by incoming data. The use of a map allows for combined operations (insertion and merging) to be handled in logarithmic time relative to the number of distinct intervals.

java
class RangeModule {
    private TreeMap<Integer, Integer> ranges;

    public RangeModule() {
        ranges = new TreeMap<>();
    }
    
    public void addRange(int value) {
        Map.Entry<Integer, Integer> low = ranges.floorEntry(value);
        int start = value, end = value;
        if (low != null) {
            int extendLeft = low.getValue();
            if (extendLeft >= value) return;
            if (extendLeft == value - 1) start = low.getKey();
        }
        Map.Entry<Integer, Integer> high = ranges.higherEntry(value);
        if (high != null && high.getKey() == value + 1) {
            end = high.getValue();
            ranges.remove(value + 1);
        }
        ranges.put(start, end);
    }
    
    public int[][] getRanges() {
        int[][] result = new int[ranges.size()][2];
        int idx = 0;
        for (Map.Entry<Integer, Integer> entry : ranges.entrySet()) {
            result[idx][0] = entry.getKey();
            result[idx++][1] = entry.getValue();
        }
        return result;
    }
}

In this Java solution for handling a data stream as disjoint intervals, a TreeMap<Integer, Integer> is employed to efficiently manage the intervals. The TreeMap stores ranges in an ordered manner where each key-value pair represents the start and the end of each range respectively.

  • Initialize a new RangeModule object with no intervals by creating an instance of TreeMap.
  • The addRange method takes a single integer, value, and adds it to the set of intervals:
    1. Search for the nearest range boundaries before and after value using floorEntry(value) and higherEntry(value).
    2. Adjust the start and end of the new interval if value bridges adjacent ranges or extends an existing range.
    3. Insert the new or updated range back into TreeMap, automatically keeping the map sorted.
  • getRanges method assembles the interval set into an array:
    1. Iterate over the entries of the TreeMap.
    2. Place each key (start of interval) and its corresponding value (end of interval) into the resulting 2D array.

This approach leverages the sorting and key boundary features of TreeMap to ensure that each range insertion operates efficiently, and that the range data can easily be retrieved in sorted order as a collection of disjoint intervals.

Comments

No comments yet.