Kth Largest Element in a Stream

Updated on 03 June, 2025

Problem Statement

In the context of university admissions, tracking the kth highest test score in real-time is crucial. This system helps dynamically adjust cut-off marks for interviews and admissions as new scores are submitted. The task involves developing a KthLargest class that continuously maintains a record of test scores and can return the kth largest score whenever a new score is added.

The KthLargest class has two primary functions:

  1. KthLargest(int k, int[] nums) — Initializes the class with an integer k and an array nums representing the stream of test scores.
  2. int add(int val) — Adds a new test score val to the stream and returns the updated kth largest score.

The goal is to return the updated kth largest score after every submission without re-sorting the entire stream. Here "kth largest" means the kth value in descending sorted order, so duplicate scores count separately.

Examples

Example 1

Input:

["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]

Output:

[null, 4, 5, 5, 8, 8]

Explanation:

KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3);  // returns 4
kthLargest.add(5);  // returns 5
kthLargest.add(10); // returns 5
kthLargest.add(9);  // returns 8
kthLargest.add(4);  // returns 8

Example 2

Input:

["KthLargest", "add", "add", "add", "add"]
[[4, [7, 7, 7, 7, 8, 3]], [2], [10], [9], [9]]

Output:

[null, 7, 7, 7, 8]

Explanation:

KthLargest kthLargest = new KthLargest(4, [7, 7, 7, 7, 8, 3]);
kthLargest.add(2);  // returns 7
kthLargest.add(10); // returns 7
kthLargest.add(9);  // returns 7
kthLargest.add(9);  // returns 8

Constraints

  • 0 <= nums.length <= 10^4
  • 1 <= k <= nums.length + 1
  • -10^4 <= nums[i], val <= 10^4
  • At most 10^4 calls will be made to add

Approach and Intuition

To efficiently maintain the kth largest value in a dynamic stream of scores:

  1. Use a Min-Heap (Priority Queue):

    • Keep the smallest value at the root.
    • Keep at most k elements in the heap: the k largest seen so far.
    • Once the heap holds k elements, its root is the kth largest element.
  2. Constructor (KthLargest(k, nums)):

    • Add elements from nums to the min-heap.
    • If the heap exceeds size k, remove the smallest element.
  3. Add Operation (add(val)):

    • If the heap has fewer than k elements, add the new value.
    • Otherwise, if the new value is larger than the smallest in the heap, remove the smallest and insert the new value; if it is not larger, discard it, since it cannot change the kth largest.
    • Return the smallest element in the heap, which is the current kth largest.
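
To make the steps above concrete, here is a minimal sketch that traces the heap while replaying Example 1. The helper name trace_kth_largest is hypothetical and exists only for illustration; it relies on Python's standard heapq module and assumes the problem's constraints hold (so the heap is never empty when it is read).

```python
import heapq
from typing import List, Tuple


def trace_kth_largest(
    k: int, nums: List[int], stream: List[int]
) -> List[Tuple[int, int, List[int]]]:
    """Seed the heap with `nums`, replay `stream`, and record each step.

    Hypothetical helper for illustration. Each recorded step is
    (value added, kth largest after the add, sorted heap contents).
    """
    heap: List[int] = []

    def offer(value: int) -> None:
        if len(heap) < k:
            # Heap not full yet: always keep the value.
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # Heap full and value beats the minimum: swap them in one step.
            heapq.heapreplace(heap, value)
        # Otherwise the value cannot affect the kth largest and is dropped.

    for n in nums:
        offer(n)

    steps = []
    for value in stream:
        offer(value)
        steps.append((value, heap[0], sorted(heap)))
    return steps


for value, kth, kept in trace_kth_largest(3, [4, 5, 8, 2], [3, 5, 10, 9, 4]):
    print(f"add({value}) -> {kth}   heap holds {kept}")
```

The returned values are 4, 5, 5, 8, 8, matching Example 1, and the heap ends up holding 8, 9 and 10.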

Time Complexity:

  • Constructor: O(n log k) for inserting n elements into a size-k heap.
  • add() method: O(log k) per operation.

Because the heap never holds more than k values, memory use stays at O(k) no matter how many scores arrive, which suits real-time systems like admission score trackers.
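
One way to gain confidence in this design is to compare it against a brute-force reference that re-sorts everything on each query. The sketch below does that on random inputs and also checks that the heap never grows beyond k. The names HeapTracker, brute_force_kth and cross_check are hypothetical, and the input ranges are arbitrary choices made for the test rather than the problem's full limits.

```python
import heapq
import random
from typing import List


class HeapTracker:
    """Size-bounded min-heap tracker (hypothetical name, illustration only)."""

    def __init__(self, k: int, nums: List[int]) -> None:
        self.k = k
        self.heap: List[int] = []
        for n in nums:
            self.add(n)

    def add(self, val: int) -> int:
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, val)
        elif val > self.heap[0]:
            heapq.heapreplace(self.heap, val)
        return self.heap[0]


def brute_force_kth(values: List[int], k: int) -> int:
    """Reference answer: sort everything seen so far, O(n log n) per query."""
    return sorted(values, reverse=True)[k - 1]


def cross_check(trials: int = 200, seed: int = 0) -> bool:
    """Return True if the heap tracker agrees with brute force on every add."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(0, 30)
        nums = [rng.randint(-100, 100) for _ in range(n)]
        k = rng.randint(1, n + 1)  # mirrors 1 <= k <= nums.length + 1
        tracker = HeapTracker(k, nums)
        seen = list(nums)
        for _ in range(20):
            val = rng.randint(-100, 100)
            seen.append(val)
            if tracker.add(val) != brute_force_kth(seen, k):
                return False
            if len(tracker.heap) > k:  # memory must stay bounded by k
                return False
    return True


print(cross_check())
```

The brute-force version is far slower per call, but it is simple enough to trust as a reference for small inputs.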

Solutions

cpp
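#include <functional>
#include <queue>
#include <vector>
using namespace std;

class KthLargestElement {
private:
    // Min-heap of the k largest values seen so far; top() is the kth largest.
    priority_queue<int, vector<int>, greater<int>> minPriorityQueue;
    int kSize;

public:
    KthLargestElement(int k, vector<int>& initialElements) {
        kSize = k;
        for (int element : initialElements) {
            insert(element);
        }
    }

    int insert(int value) {
        // size() is unsigned, so cast it before comparing with the int kSize.
        if (static_cast<int>(minPriorityQueue.size()) < kSize || minPriorityQueue.top() < value) {
            minPriorityQueue.push(value);
            if (static_cast<int>(minPriorityQueue.size()) > kSize) {
                minPriorityQueue.pop();  // evict the smallest to get back to k values
            }
        }
        return minPriorityQueue.top();
    }
};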

The C++ class, named KthLargestElement here (with insert playing the role of add from the problem statement), tracks the kth largest element in a running stream of numbers using a min-heap built on priority_queue. Below is a breakdown of how the solution works:

  • Define a private member minPriorityQueue of type priority_queue<int, vector<int>, greater<int>>. This structure automatically maintains the smallest element at the top, thus supporting quick access to the minimum of its elements.

  • Utilize an integer kSize as a private member to store the value of k, which represents the k-th position in the order of largest elements you want to track.

  • The constructor KthLargestElement(int k, vector<int>& initialElements) initializes the class object:

    • Set kSize with the value of k.
    • Iterate over initialElements, and for each element, invoke the insert method to add the element to the stream.
  • The insert function takes a new value, value, as an input:

    • If the size of the min-heap is less than k, or if the minimum element in the heap (minPriorityQueue.top()) is less than value, push value into the min-heap.
    • If pushing value causes the min-heap's size to exceed kSize, remove the smallest element (top of the heap), maintaining the heap size at k.
    • Return the current k-th largest element by returning the top element of the min-heap, which is the smallest element in the current top k largest elements.

Each insertion costs O(log k) time and the heap never holds more than k values, so the approach handles long or continuous streams efficiently.

java
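import java.util.PriorityQueue;

class KthLargestElement {

    // Min-heap of the k largest values seen so far; peek() is the kth largest.
    private final PriorityQueue<Integer> priorityQueue;
    private final int heapSize;

    public KthLargestElement(int k, int[] array) {
        priorityQueue = new PriorityQueue<>();
        this.heapSize = k;

        for (int element : array) {
            add(element);
        }
    }

    public int add(int value) {
        // Accept the value if the heap is not full or it beats the current minimum.
        if (priorityQueue.size() < heapSize || priorityQueue.peek() < value) {
            priorityQueue.add(value);
            if (priorityQueue.size() > heapSize) {
                priorityQueue.poll(); // evict the smallest to get back to k values
            }
        }
        return priorityQueue.peek();
    }
}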

The solution involves implementing a class KthLargestElement in Java to find the kth largest element in a stream of numbers efficiently using a min-heap, which is represented by Java's PriorityQueue. The steps taken within the code are as follows:

  1. Create a private PriorityQueue<Integer> priorityQueue to store the k largest elements encountered so far. PriorityQueue is a min-heap by default, so the smallest of those elements sits at the head.
  2. Define an integer heapSize to store the value of k, which represents the position of the largest element we are interested in.
  3. Construct the KthLargestElement with integer k and an integer array. The constructor initializes the heap by iterating through each element of the array and adding it to the stream using the add method.
  4. Implement the add method which accepts an integer value:
    • Check if the size of the priorityQueue is less than k or if the smallest element in the queue is less than the new value to decide whether to add the new value to the queue.
    • If added, and the queue size exceeds k, remove the smallest element, ensuring the queue always contains the k largest elements.
    • Return the root of the min-heap, which is now the kth largest element.

This keeps each add call at O(log k) time for the heap update, while reading the kth largest element from the head of the queue is O(1).

python
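import heapq
from typing import List


class KthLargestElement:
    def __init__(self, k: int, nums: List[int]):
        # Min-heap of the k largest values seen so far; heap[0] is the kth largest.
        self.heap = []
        self.capacity = k
        for number in nums:
            self.insert(number)

    def insert(self, value: int) -> int:
        # Accept the value if the heap is not full or it beats the current minimum.
        if len(self.heap) < self.capacity or self.heap[0] < value:
            heapq.heappush(self.heap, value)
            if len(self.heap) > self.capacity:
                heapq.heappop(self.heap)  # evict the smallest to get back to k values
        return self.heap[0]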

Solution Summary for "Kth Largest Element in a Stream":

The solution implements a class named KthLargestElement in Python, which manages a stream of numbers and returns the kth largest element after each insertion (its insert method plays the role of add from the problem statement). It uses a min-heap that holds the k largest elements seen so far.

Here's a breakdown of the implementation:

  • An initializer method __init__ accepts two parameters—k, the rank of the largest element you wish to track, and nums, an initial list of integers. It sets up an empty heap and then inserts each number from nums using the insert method.
  • The insert method takes a new integer value, adds it to the heap if the heap holds fewer than k items, or if the new integer is larger than the smallest item in the heap. If adding a new integer exceeds the heap's capacity (i.e., k elements), it removes the smallest item, thereby maintaining only the largest k elements in the heap.
  • It always returns the smallest element in the heap, which is the kth largest element in the context of the nums processed so far.

Each insertion takes O(log k) time and the heap never grows beyond k elements, so the approach stays fast and memory-bounded even when the stream grows large.

Note that the code relies on the heapq module, and on List from typing for the type annotations, so both must be imported.
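
The constraint 1 <= k <= nums.length + 1 means the initial list may hold only k - 1 values, or none at all, so the heap can start below capacity. The short sketch below repeats the class from this section together with its imports and exercises that case. The inputs are made up for illustration and do not come from the problem's examples.

```python
import heapq
from typing import List


class KthLargestElement:
    def __init__(self, k: int, nums: List[int]):
        self.heap: List[int] = []
        self.capacity = k
        for number in nums:
            self.insert(number)

    def insert(self, value: int) -> int:
        if len(self.heap) < self.capacity or self.heap[0] < value:
            heapq.heappush(self.heap, value)
            if len(self.heap) > self.capacity:
                heapq.heappop(self.heap)
        return self.heap[0]


# k = 3 but only two initial values: the heap starts one short of capacity.
tracker = KthLargestElement(3, [5, -1])
print(tracker.insert(2))  # heap fills up with -1, 2, 5; third largest is -1
print(tracker.insert(7))  # -1 is evicted, leaving 2, 5, 7; third largest is 2

# k = 1 with an empty initial list: the first insert becomes the answer.
empty_start = KthLargestElement(1, [])
print(empty_start.insert(-3))  # -3
print(empty_start.insert(-2))  # -2 replaces -3 as the largest
```

The length check comes first in the condition, so self.heap[0] is never read while the heap is empty.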
