Find Median from Data Stream

Hardheapdesigntwo-heaps
Category: Heap / Priority Queue
Companies that ask this question:
AmazonGoogleFacebookMicrosoftApple

Approach

Find Median from Data Stream

Problem Statement

The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.

Implement the MedianFinder class:

  • MedianFinder() initializes the object
  • void addNum(int num) adds the integer num from the data stream
  • double findMedian() returns the median of all elements so far

Examples

Example 1

Input:
["MedianFinder", "addNum", "addNum", "findMedian", "addNum", "findMedian"]
[[], [1], [2], [], [3], []]

Output:
[null, null, null, 1.5, null, 2.0]

Explanation:
MedianFinder medianFinder = new MedianFinder();
medianFinder.addNum(1);    // arr = [1]
medianFinder.addNum(2);    // arr = [1, 2]
medianFinder.findMedian(); // return 1.5 (1 + 2) / 2
medianFinder.addNum(3);    // arr = [1, 2, 3]
medianFinder.findMedian(); // return 2.0

Intuition

Challenge: Need median quickly as numbers stream in.

Naive: Sort after each add → O(n log n) per add

Key Insight: Use two heaps to maintain sorted halves!

  • Max-heap (left half): stores smaller half of numbers
  • Min-heap (right half): stores larger half of numbers

Invariant:

  • Max-heap top ≤ Min-heap top (left max ≤ right min)
  • Heaps differ in size by at most 1

Median:

  • If equal sizes: average of both tops
  • If unequal: top of larger heap

Pattern Recognition

This is a Two-Heaps Balance problem:

  • Maintain balance between two heaps
  • Max-heap for left, min-heap for right
  • Classic streaming median technique

Approach

Two Heaps Solution

Data Structures:

  • max_heap: Smaller half (left side)
  • min_heap: Larger half (right side)

addNum(num):

  1. Add to max_heap (left)
  2. Balance: Move max_heap top to min_heap
  3. Rebalance if sizes differ by > 1
  4. Maintain: max_heap.size() ≥ min_heap.size()

findMedian():

  • If equal sizes: (max_heap.top() + min_heap.top()) / 2
  • If max_heap larger: max_heap.top()

Why this works: Heaps keep data partially sorted with O(log n) operations!

Edge Cases

  • Single element
  • Two elements (even count)
  • Odd vs even number of elements
  • Negative numbers
  • Duplicate values
  • Very large/small values

Complexity Analysis

  • addNum: O(log n)

    • 2-3 heap operations, each O(log n)
  • findMedian: O(1)

    • Just peek at heap tops
  • Space: O(n)

    • Store all n numbers across both heaps

Much better than:

  • Sorting every time: O(n log n) per add
  • Binary search insertion: O(n) per add

Solution

java
1import java.util.PriorityQueue;
2import java.util.Collections;
3
4class MedianFinder {
5    private PriorityQueue<Integer> maxHeap;  // Smaller half (left)
6    private PriorityQueue<Integer> minHeap;  // Larger half (right)
7
8    public MedianFinder() {
9        maxHeap = new PriorityQueue<>(Collections.reverseOrder());
10        minHeap = new PriorityQueue<>();
11    }
12
13    public void addNum(int num) {
14        // Add to max_heap (left half)
15        maxHeap.offer(num);
16
17        // Balance: Ensure max_heap top <= min_heap top
18        if (!maxHeap.isEmpty() && !minHeap.isEmpty() && maxHeap.peek() > minHeap.peek()) {
19            minHeap.offer(maxHeap.poll());
20        }
21
22        // Maintain size property: max_heap.size >= min_heap.size
23        if (maxHeap.size() < minHeap.size()) {
24            maxHeap.offer(minHeap.poll());
25        } else if (maxHeap.size() > minHeap.size() + 1) {
26            minHeap.offer(maxHeap.poll());
27        }
28    }
29
30    public double findMedian() {
31        if (maxHeap.size() > minHeap.size()) {
32            // Odd count, median is top of max_heap
33            return (double) maxHeap.peek();
34        } else {
35            // Even count, median is average of both tops
36            return (maxHeap.peek() + minHeap.peek()) / 2.0;
37        }
38    }
39}
40
Loading visualizer...