
Problem Statement
The task is to create a class MedianFinder
that dynamically calculates the median of a continuously updating list of integers. The median is defined as the middle value in a list when sorted. If the list's length is even, the median is the average of the two middle numbers.
- When the integer list like
[2,3,4]
is provided, the median is the single middle value3
. - For an even number of elements, such as in
[2,3]
, the median becomes the average of the middle two numbers, resulting in a median of2.5
.
The MedianFinder
class should be able to:
- Initialize with the
MedianFinder()
constructor. - Add a new number with
addNum(int num)
. - Provide the current median of all added numbers with
findMedian()
, being within10-5
tolerance of the actual median.
Examples
Example 1
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"] [[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder(); medianFinder.addNum(1); // arr = [1] medianFinder.addNum(2); // arr = [1, 2] medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2) medianFinder.addNum(3); // arr[1, 2, 3] medianFinder.findMedian(); // return 2.0
Constraints
-105 <= num <= 105
- There will be at least one element in the data structure before calling
findMedian
. - At most
5 * 104
calls will be made toaddNum
andfindMedian
.
Approach and Intuition
The main challenge in implementing the MedianFinder
class lies in efficiently managing an unsorted stream of data and finding the median in constant or logarithmic time as data accumulates. The class should perform the following operations:
The
MedianFinder()
constructor initializes the data structure.addNum(int num)
:- This method handles the addition of each new number
num
into the structure. - To efficiently retrieve the median, we can use two heaps (a max-heap to store the smaller half of the numbers and a min-heap for the larger half). This allows us to balance the heaps whenever a new number is added such that:
- The max-heap contains the smaller half with the largest values accessible in constant time.
- The min-heap contains the larger half, again allowing constant time access to the smallest values of this half.
- This method handles the addition of each new number
findMedian()
:- If the total count of numbers is odd, the median is located at the top of the larger heap (depending on which heap has more elements).
- If the count is even, the median is the average of the roots of both heaps.
- Given the structure of the heaps, access to these roots is constant time.
Example
Let's consider ["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
with inputs [[], [1], [2], [], [3], []]
. The operations would proceed as follows:
- A new
MedianFinder
object is instantiated. 1
is added to the data structure. Since it's the first element, it is alone in the max-heap.2
is added next. Given our two-heaps approach,1
stays in the max-heap, and2
moves to the min-heap.- Calling
findMedian()
at this stage computes the median of[1, 2]
which is1.5
. - After adding
3
, the heaps adjust to keep the balance: max-heap[1]
, min-heap[2, 3]
. - The next
findMedian()
call calculates the median from[1, 2, 3]
, which is the root of the min-heap, resulting in2.0
.
Notes on Constraints and Edge Cases
The constraints ensure that:
- The stream will always have at least one element when
findMedian
is called, preventing queries on an empty dataset. - The range of
num
is capped at±105
, and the number of operations is bounded by5 * 104
. The heap-based approach is efficient enough to handle this boundary within acceptable computational limits.
Solutions
- C++
class MedianCalculator {
multiset<int> numbers;
multiset<int>::iterator medianIterator;
public:
MedianCalculator()
: medianIterator(numbers.end())
{
}
void insertNum(int value)
{
const int count = numbers.size();
numbers.insert(value);
if (!count) // inserting the first element
medianIterator = numbers.begin();
else if (value < *medianIterator) // insert value affects the lower half
medianIterator = (count % 2 ? medianIterator : prev(medianIterator));
else // insert value affects the upper half
medianIterator = (count % 2 ? next(medianIterator) : medianIterator);
}
double computeMedian()
{
const int size = numbers.size();
return ((double) *medianIterator + *next(medianIterator, size % 2 - 1)) * 0.5;
}
};
The provided C++ code defines a class MedianCalculator
that efficiently calculates the median of a dynamically changing data stream using a multiset to store the numbers and iterators for median management. Here's a concise explanation of how each part functions and the significance of the operations within:
Initialization:
- The constructor initializes
medianIterator
to the end of an empty multiset, preparing it for the incoming data.
- The constructor initializes
Inserting a Number:
- The
insertNum
method adds a new number to the multiset and adjusts themedianIterator
accordingly. - When inserting into an empty data structure, the iterator is set to the beginning.
- When inserting a number that affects the lower half of the multiset (i.e., the number is less than the median), the iterator may either stay in place or move back one position, depending on the set's current size.
- For numbers affecting the upper half (i.e., equal to or greater than the median), the iterator may move forward one position or stay put, dependent on the odd or even count of the set.
- The
Computing the Median:
- The
computeMedian
method determines the median by comparing the size of the multiset. - If the set size is odd, it directly returns the value at
medianIterator
. - For even-sized sets, it calculates the average of the elements pointed to by
medianIterator
and the following element.
- The
This setup is suitable for scenarios where a considerable amount of data is continuously added and median needs constant recalculating, providing an efficient solution by maintaining sorted elements and allowing quick median computation.
No comments yet.