295. Find Median from Data Stream
Problem Description
The MedianFinder
class is designed to find the median of a data stream. The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
- Implement the MedianFinder class:
MedianFinder()
initializes the MedianFinder object.void addNum(int num)
adds the integer num from the data stream to the data structure.double findMedian()
returns the median of all elements so far. Answers within 10⁻⁵ of the actual answer will be accepted.
Example 1:
Input
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]
Output
[null, null, null, 1.5, null, 2.0]
Explanation
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1); // arr = [1]
medianFinder.addNum(2); // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2)
medianFinder.addNum(3); // arr[1, 2, 3]
medianFinder.findMedian(); // return 2.0
Constraints:
-10⁵ <= num <= 10⁵
- There will be at least one element in the data structure before calling
findMedian
. - At most
5 * 10⁴
calls will be made toaddNum
andfindMedian
.
Follow up:
- If all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?
- If 99% of all integer numbers from the stream are in the range [0, 100], how would you optimize your solution?
Solution
Python Solution
class MedianFinder:
def __init__(self):
# max heap for the first half
self.small = []
# min heap for the second half
self.large = []
def addNum(self, num: int) -> None:
# Always add to small (max heap) first
heapq.heappush(self.small, -num)
# Make sure every num in small <= every num in large
if self.small and self.large and -self.small[0] > self.large[0]:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
# Handle uneven size
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small):
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def findMedian(self) -> float:
if len(self.small) > len(self.large):
return -self.small[0]
return (-self.small[0] + self.large[0]) / 2
Time Complexity:
- addNum: O(log n)
- findMedian: O(1)
Space Complexity: O(n)
To store n numbers in the heaps.
Java Solution
class MedianFinder {
private PriorityQueue small; // max heap
private PriorityQueue large; // min heap
public MedianFinder() {
small = new PriorityQueue<>((a, b) -> b - a); // max heap
large = new PriorityQueue<>(); // min heap
}
public void addNum(int num) {
// Always add to small (max heap) first
small.offer(num);
// Make sure every num in small <= every num in large
if (!small.isEmpty() && !large.isEmpty() && small.peek() > large.peek()) {
large.offer(small.poll());
}
// Handle uneven size
if (small.size() > large.size() + 1) {
large.offer(small.poll());
}
if (large.size() > small.size()) {
small.offer(large.poll());
}
}
public double findMedian() {
if (small.size() > large.size()) {
return small.peek();
}
return (small.peek() + large.peek()) / 2.0;
}
}
Time Complexity:
- addNum: O(log n)
- findMedian: O(1)
Space Complexity: O(n)
To store n numbers in the heaps.
C++ Solution
class MedianFinder {
private:
priority_queue small; // max heap
priority_queue, greater> large; // min heap
public:
MedianFinder() {}
void addNum(int num) {
// Always add to small (max heap) first
small.push(num);
// Make sure every num in small <= every num in large
if (!small.empty() && !large.empty() && small.top() > large.top()) {
large.push(small.top());
small.pop();
}
// Handle uneven size
if (small.size() > large.size() + 1) {
large.push(small.top());
small.pop();
}
if (large.size() > small.size()) {
small.push(large.top());
large.pop();
}
}
double findMedian() {
if (small.size() > large.size()) {
return small.top();
}
return (small.top() + large.top()) / 2.0;
}
};
Time Complexity:
- addNum: O(log n)
- findMedian: O(1)
Space Complexity: O(n)
To store n numbers in the heaps.
JavaScript Solution
var MedianFinder = function() {
this.nums = [];
};
/**
* @param {number} num
* @return {void}
*/
MedianFinder.prototype.addNum = function(num) {
if (this.nums.length === 0) {
this.nums.push(num);
return;
}
// Binary search to find insertion position
let left = 0;
let right = this.nums.length;
while (left < right) {
let mid = Math.floor((left + right) / 2);
if (this.nums[mid] < num) {
left = mid + 1;
} else {
right = mid;
}
}
this.nums.splice(left, 0, num);
};
/**
* @return {number}
*/
MedianFinder.prototype.findMedian = function() {
const n = this.nums.length;
if (n % 2 === 1) {
return this.nums[Math.floor(n/2)];
}
return (this.nums[n/2-1] + this.nums[n/2]) / 2;
};
Time Complexity:
- addNum: O(n) due to array insertion
- findMedian: O(1)
Space Complexity: O(n)
To store n numbers in the array.
C# Solution
public class MedianFinder {
private PriorityQueue small; // max heap
private PriorityQueue large; // min heap
public MedianFinder() {
small = new PriorityQueue(); // max heap
large = new PriorityQueue(); // min heap
}
public void AddNum(int num) {
// Always add to small (max heap) first
small.Enqueue(num, -num);
// Make sure every num in small <= every num in large
if (small.Count > 0 && large.Count > 0 && -small.Peek() > large.Peek()) {
large.Enqueue(-small.Dequeue(), -small.Dequeue());
}
// Handle uneven size
if (small.Count > large.Count + 1) {
large.Enqueue(-small.Dequeue(), -small.Dequeue());
}
if (large.Count > small.Count) {
small.Enqueue(large.Dequeue(), -large.Dequeue());
}
}
public double FindMedian() {
if (small.Count > large.Count) {
return -small.Peek();
}
return (-small.Peek() + large.Peek()) / 2.0;
}
}
Time Complexity:
- addNum: O(log n)
- findMedian: O(1)
Space Complexity: O(n)
To store n numbers in the heaps.
Approach Explanation
The solution uses two heaps to maintain the median:
- Key Insight:
- Use max heap for smaller half
- Use min heap for larger half
- Keep heaps balanced or max heap one larger
- Median is either max heap top or average
- Algorithm Steps:
- Add to max heap first
- Balance values between heaps
- Maintain size property
- Calculate median from tops
Example walkthrough:
- Add 1: small=[1], large=[]
- Add 2: small=[1], large=[2]
- Add 3: small=[1], large=[2,3]
- Rebalance: small=[1,2], large=[3]
Optimization insights:
- Constant time median finding
- Logarithmic insertion time
- Automatic sorting through heaps
- Minimal memory overhead
Follow-up Solutions:
- For [0,100]: Use counting sort array
- For 99% in [0,100]: Use bucket + heap
- Space-time tradeoff possible
- Consider streaming requirements