Median of Two Sorted Arrays

Updated on 11 June, 2025

Problem Statement

Given two sorted arrays, nums1 and nums2, of sizes m and n, find the median of their combined elements. The median is the middle value of a sorted data set, or the average of the two middle values when the number of elements is even. The required time complexity is O(log (m+n)), which rules out merging the arrays and points to a divide-and-conquer strategy such as binary search.

Examples

Example 1

Input:

nums1 = [1,3], nums2 = [2]

Output:

2.00000

Explanation:

merged array = [1,2,3] and median is 2.

Example 2

Input:

nums1 = [1,2], nums2 = [3,4]

Output:

2.50000

Explanation:

merged array = [1,2,3,4] and median is (2 + 3) / 2 = 2.5.

Constraints

  • nums1.length == m
  • nums2.length == n
  • 0 <= m <= 1000
  • 0 <= n <= 1000
  • 1 <= m + n <= 2000
  • -10⁶ <= nums1[i], nums2[i] <= 10⁶

Approach and Intuition

The median divides a sorted dataset into two halves of equal size, so finding it in a single sorted array is a matter of indexing. The challenge here is to locate the same dividing point across two arrays without merging them, so that the running time stays logarithmic.
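For reference, the merge-based approach that the binary search avoids can be sketched as follows. This is only a baseline, assuming O(m + n) time and extra memory are acceptable; the name median_by_merging is illustrative and does not appear in the solutions on this page.

```python
from heapq import merge


def median_by_merging(nums1, nums2):
    """Baseline: merge both sorted lists, then read off the middle value(s)."""
    merged = list(merge(nums1, nums2))  # linear-time merge of two sorted inputs
    total = len(merged)
    if total == 0:
        raise ValueError("at least one array must be non-empty")
    mid = total // 2
    if total % 2 == 1:
        return float(merged[mid])
    return (merged[mid - 1] + merged[mid]) / 2


print(median_by_merging([1, 3], [2]))     # 2.0
print(median_by_merging([1, 2], [3, 4]))  # 2.5
```

A baseline like this is mainly useful for checking the faster solution on small inputs.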

Logical Approach through Binary Search:

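  1. Search on the Smaller Array: Always perform binary search on the smaller array, swapping the inputs if necessary so that nums1 is the shorter one. This minimizes the search space and keeps the matching partition index in nums2 within bounds.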

  2. Partitioning Concept: Use binary search to find a partition of nums1 and a corresponding partition of nums2 such that:

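    • All elements in the left partition of nums1 and nums2 are less than or equal to all elements in the right partition of nums1 and nums2.
    • The two left partitions together hold (m + n + 1) / 2 elements (integer division), so choosing the partition of nums1 fixes the partition of nums2.
  3. Binary Search Adjustments: Let maxLeft1 and minRight1 be the elements of nums1 immediately to the left and right of its partition, and maxLeft2 and minRight2 the same for nums2. While searching:

    • If maxLeft1 > minRight2, too much of nums1 is on the left, so move its partition left.

    • If maxLeft2 > minRight1, too little of nums1 is on the left, so move its partition right.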

    • Once a valid partition is found:

      • If total length is odd, the median is max(maxLeft1, maxLeft2).
      • If even, the median is the average of max(maxLeft1, maxLeft2) and min(minRight1, minRight2).
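The steps above can be traced on a small input. The sketch below is a hypothetical helper (trace_partition_search is not part of the solutions on this page) that records each probed partition and the decision taken. It uses infinities as stand-ins when one side of a partition is empty, which is an assumption of this sketch.

```python
import math


def trace_partition_search(nums1, nums2):
    """Return (median, steps); each step is (i, j, decision)."""
    if len(nums1) > len(nums2):
        nums1, nums2 = nums2, nums1  # search the shorter array
    m, n = len(nums1), len(nums2)
    half = (m + n + 1) // 2  # size of the combined left side
    low, high = 0, m
    steps = []
    while low <= high:
        i = (low + high) // 2  # elements of nums1 on the left
        j = half - i           # elements of nums2 on the left
        max_left1 = nums1[i - 1] if i > 0 else -math.inf
        min_right1 = nums1[i] if i < m else math.inf
        max_left2 = nums2[j - 1] if j > 0 else -math.inf
        min_right2 = nums2[j] if j < n else math.inf
        if max_left1 > min_right2:
            steps.append((i, j, "move left"))
            high = i - 1
        elif max_left2 > min_right1:
            steps.append((i, j, "move right"))
            low = i + 1
        else:
            steps.append((i, j, "valid"))
            left = max(max_left1, max_left2)
            if (m + n) % 2 == 1:
                return float(left), steps
            return (left + min(min_right1, min_right2)) / 2, steps
    raise ValueError("inputs must be sorted")


median, steps = trace_partition_search([1, 2], [3, 4])
print(median)  # 2.5
print(steps)   # [(1, 1, 'move right'), (2, 0, 'valid')]
```

For Example 2 the first probe puts one element of each array on the left, finds that 3 > 2, and moves right. The second probe puts all of nums1 on the left and is valid.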

Why It Works:

  • This approach avoids merging the arrays explicitly.
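  • By using binary search on the shorter array, the time complexity is reduced to O(log(min(m,n))), which is within the required O(log(m+n)) bound.
  • It is correct because the arrays are sorted: each failed comparison shows on which side of the current position the valid partition must lie, so half of the remaining candidates can be discarded safely.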
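To make the logarithmic bound concrete, the following sketch computes the worst-case number of loop iterations for given sizes. It assumes the usual binary search over the min(m, n) + 1 possible partition positions; max_probes is an illustrative name.

```python
def max_probes(m, n):
    """Upper bound on loop iterations when binary-searching the shorter array."""
    candidates = min(m, n) + 1      # partition positions 0..min(m, n)
    return candidates.bit_length()  # equals floor(log2(candidates)) + 1


for m, n in [(1000, 1000), (3, 100), (0, 5)]:
    print(f"m={m}, n={n}: merge visits {m + n} elements, "
          f"search probes at most {max_probes(m, n)} partitions")
```

At the largest sizes allowed by the constraints (m = n = 1000), the search examines at most 10 partitions, while merging visits all 2000 elements.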

Solutions

  • C++
  • Java
  • C
  • JavaScript
  • Python
cpp
#include <algorithm>
#include <climits>
#include <vector>

using namespace std;

class Solution {
public:
    double findMedian(vector<int>& nums1, vector<int>& nums2) {
        if (nums1.size() > nums2.size()) {
            return findMedian(nums2, nums1);
        }

        int size1 = nums1.size(), size2 = nums2.size();
        int low = 0, high = size1;

        while (low <= high) {
            int partitionX = (low + high) / 2;
            int partitionY = (size1 + size2 + 1) / 2 - partitionX;

            int maxLeftX = (partitionX == 0) ? INT_MIN : nums1[partitionX - 1];
            int minRightX = (partitionX == size1) ? INT_MAX : nums1[partitionX];
            int maxLeftY = (partitionY == 0) ? INT_MIN : nums2[partitionY - 1];
            int minRightY = (partitionY == size2) ? INT_MAX : nums2[partitionY];

            if (maxLeftX <= minRightY && maxLeftY <= minRightX) {
                if ((size1 + size2) % 2 == 0) {
                    return (max(maxLeftX, maxLeftY) +
                            min(minRightX, minRightY)) /
                           2.0;
                } else {
                    return max(maxLeftX, maxLeftY);
                }
            } else if (maxLeftX > minRightY) {
                high = partitionX - 1;
            } else {
                low = partitionX + 1;
            }
        }

        return 0.0;
    }
};

The given C++ code defines a solution to find the median of two sorted arrays. The approach uses a binary search algorithm for achieving an efficient O(log(min(n,m))) time complexity, where n and m are the sizes of the two arrays. Here is a concise breakdown of how the solution works:

  • The function starts by ensuring the smaller array (nums1) is always the first parameter. This simplifies the logic required for managing the different array sizes.
  • Variables size1 and size2 are initialized to store the sizes of nums1 and nums2 respectively.
  • Two bounds, low and high, delimit the candidate partition positions in the smaller array (nums1). high is initialized to size1 because a partition may place all of nums1 on the left.
  • A while loop continues until low exceeds high. In each iteration:
    • Calculate partitionX as the midpoint of the range and partitionY so that the two left parts together hold (size1 + size2 + 1) / 2 elements, that is, half of the combined array, plus the middle element when the total is odd.
    • Determine maxLeftX, minRightX, maxLeftY, and minRightY, the elements immediately to the left and right of each partition. INT_MIN and INT_MAX stand in when a side is empty.
    • Three cases follow from these values:
      • If maxLeftX is less than or equal to minRightY and maxLeftY is less than or equal to minRightX, the partition is valid. The median then depends on whether the total number of elements is odd or even:
        • If even, the median is the average of the maximum of the left elements and the minimum of the right elements.
        • If odd, the median is the maximum of the left elements.
      • If maxLeftX is greater than minRightY, too many elements of nums1 are on the left, so high moves to partitionX - 1.
      • Otherwise maxLeftY is greater than minRightX, too few elements of nums1 are on the left, so low moves to partitionX + 1.
  • The final return of 0.0 is a fall-through that is not reached when both inputs are sorted.

The algorithm never materializes the merged array. It needs only the four values around the two partition points, which is what keeps the running time logarithmic and the extra space constant.
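The role of the "+ 1" in the partitionY formula can be checked with a short sketch. It counts how many elements land on each side for a given cut; half_sizes is a hypothetical helper written only for this illustration.

```python
def half_sizes(size1, size2, partition_x):
    """Return (left_count, right_count) for a cut at partition_x in the shorter array."""
    partition_y = (size1 + size2 + 1) // 2 - partition_x
    left = partition_x + partition_y
    right = (size1 - partition_x) + (size2 - partition_y)
    return left, right


print(half_sizes(1, 2, 0))  # (2, 1): odd total, the left side holds the extra element
print(half_sizes(2, 2, 1))  # (2, 2): even total, both sides are equal
```

Because the left side receives the extra element when the total is odd, the odd case can return the maximum of the left elements directly.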

java
class Solution {
    public double calculateMedian(int[] array1, int[] array2) {
        if (array1.length > array2.length) {
            return calculateMedian(array2, array1);
        }

        int x = array1.length, y = array2.length;
        int low = 0, high = x;

        while (low <= high) {
            int partitionX = (low + high) / 2;
            int partitionY = (x + y + 1) / 2 - partitionX;

            int maxLeftX = (partitionX == 0) ? Integer.MIN_VALUE : array1[partitionX - 1];
            int minRightX = (partitionX == x) ? Integer.MAX_VALUE : array1[partitionX];
            int maxLeftY = (partitionY == 0) ? Integer.MIN_VALUE : array2[partitionY - 1];
            int minRightY = (partitionY == y) ? Integer.MAX_VALUE : array2[partitionY];

            if (maxLeftX <= minRightY && maxLeftY <= minRightX) {
                if ((x + y) % 2 == 0) {
                    return (
                        (double)(Math.max(maxLeftX, maxLeftY) + Math.min(minRightX, minRightY)) / 2
                    );
                } else {
                    return (double)Math.max(maxLeftX, maxLeftY);
                }
            } else if (maxLeftX > minRightY) {
                high = partitionX - 1;
            } else {
                low = partitionX + 1;
            }
        }
        return 0.0;
    }
}

The Java version follows the same partition search. It looks for a position in the shorter array, and the matching position in the longer one, where every element on the combined left side is less than or equal to every element on the combined right side.

To achieve this:

  1. If array1 is longer than array2, swap them through a recursive call so that the binary search runs on the shorter array.
  2. Initialize low = 0 and high = x, the range of possible partition positions in array1.
  3. In each iteration, take partitionX as the midpoint of the range, derive partitionY so that the left side holds (x + y + 1) / 2 elements, and read the four values adjacent to the two partitions.
  4. If the partition is valid, return the median:
    • For even combined array length, the average of the largest left value and the smallest right value.
    • For odd combined array length, the largest left value.
  5. Otherwise, move high below partitionX when maxLeftX > minRightY, or move low above it when maxLeftY > minRightX.

The function returns a double because the median of an even number of integers may be fractional. The cast to double before dividing by 2 prevents integer division from truncating the result.

c
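#include <limits.h>

/* Forward declarations: the helpers are defined after their first use. */
int secondMax(int x, int y);
int secondMin(int x, int y);

double medianOfArrays(int* array1, int size1, int* array2, int size2) {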
    if (size1 > size2) {
        return medianOfArrays(array2, size2, array1, size1);
    }

    int len1 = size1, len2 = size2;
    int lo = 0, hi = len1;

    while (lo <= hi) {
        int partX = (lo + hi) / 2;
        int partY = (len1 + len2 + 1) / 2 - partX;

        int leftX = (partX == 0) ? INT_MIN : array1[partX - 1];
        int rightX = (partX == len1) ? INT_MAX : array1[partX];
        int leftY = (partY == 0) ? INT_MIN : array2[partY - 1];
        int rightY = (partY == len2) ? INT_MAX : array2[partY];

        if (leftX <= rightY && leftY <= rightX) {
            if ((len1 + len2) % 2 == 0) {
                return (secondMax(leftX, leftY) + secondMin(rightX, rightY)) / 2.0;
            } else {
                return secondMax(leftX, leftY);
            }
        } else if (leftX > rightY) {
            hi = partX - 1;
        } else {
            lo = partX + 1;
        }
    }

    return 0.0;
}

int secondMax(int x, int y) { return (x > y) ? x : y; }

int secondMin(int x, int y) { return (x < y) ? x : y; }

To calculate the median of two sorted arrays in C, the function medianOfArrays combines binary search with the partitioning technique. Here's an overview of how the function operates:

  • Begin by ensuring that array1 is the array with fewer elements. If it is not, call the function again with the two arrays swapped.
  • Set variables for the length of each array (len1 and len2) and initialize indices lo (low) and hi (high) for the binary search boundaries, where hi is initialized to the length of the smaller array.
  • Use a binary search loop where:
    • Calculate partX and partY so that the left parts of the two arrays together hold (len1 + len2 + 1) / 2 elements.
    • Define the boundary elements for these partitions from both arrays, using INT_MIN and INT_MAX when a partition sits at either end of an array and one side is empty.
    • Evaluate whether these partitions separate the combined elements into two halves such that every element in the left half is less than or equal to every element in the right half.
  • If a valid partition is found, determine if the combined array length is odd or even:
    • For an even length, compute the median as the average of the maximum of left elements and the minimum of right elements.
    • For an odd length, take the maximum of left elements as the median.
  • If leftX is greater than rightY, lower hi to partX - 1. Otherwise leftY is greater than rightX, so raise lo to partX + 1.
  • The helper functions secondMax and secondMin return the larger and the smaller of two integers. They are defined after medianOfArrays, so they must be declared before it for the file to compile as standard C.

Important considerations:

  • Handle edge cases where there may be no elements on one side of a partition by using appropriate limits (INT_MIN and INT_MAX).
  • Ensure the arrays passed to this function are already sorted. The function does not check this, and the result is meaningless otherwise.
  • Each iteration halves the search range over the shorter array, so the loop runs O(log(min(size1, size2))) times.
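The sortedness requirement can be checked up front while debugging. The sketch below uses illustrative helper names (is_sorted, check_inputs). The check costs O(m + n), so it defeats the logarithmic bound and is best kept out of the hot path.

```python
def is_sorted(values):
    """True if values are in non-decreasing order (empty and single-item lists count)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def check_inputs(nums1, nums2):
    """Raise ValueError if a precondition of the partition search is violated."""
    if not nums1 and not nums2:
        raise ValueError("at least one array must be non-empty")
    if not (is_sorted(nums1) and is_sorted(nums2)):
        raise ValueError("both arrays must be sorted in non-decreasing order")


check_inputs([1, 3], [2])  # passes silently
print(is_sorted([3, 1]))   # False
```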

js
var calculateMedian = function (array1, array2) {
    if (array1.length > array2.length) {
        let swap = array1;
        array1 = array2;
        array2 = swap;
    }

    let len1 = array1.length,
        len2 = array2.length;
    let minIndex = 0,
        maxIndex = len1;

    while (minIndex <= maxIndex) {
        let i = Math.floor((minIndex + maxIndex) / 2);
        let j = Math.floor((len1 + len2 + 1) / 2 - i);

        let leftMax1 = i == 0 ? Number.MIN_SAFE_INTEGER : array1[i - 1];
        let rightMin1 = i == len1 ? Number.MAX_SAFE_INTEGER : array1[i];
        let leftMax2 = j == 0 ? Number.MIN_SAFE_INTEGER : array2[j - 1];
        let rightMin2 = j == len2 ? Number.MAX_SAFE_INTEGER : array2[j];

        if (leftMax1 <= rightMin2 && leftMax2 <= rightMin1) {
            if ((len1 + len2) % 2 == 0) {
                return (
                    (Math.max(leftMax1, leftMax2) +
                        Math.min(rightMin1, rightMin2)) /
                    2.0
                );
            } else {
                return Math.max(leftMax1, leftMax2);
            }
        } else if (leftMax1 > rightMin2) {
            maxIndex = i - 1;
        } else {
            minIndex = i + 1;
        }
    }
    return 0.0;
};

The provided JavaScript solution efficiently finds the median of two sorted arrays. The function calculateMedian accepts two arrays (array1 and array2) as inputs. Here's a breakdown of how the solution works:

  • Initially, the function checks if array1 is longer than array2. If true, it swaps them. This ensures that array1 is always the shorter array.
  • Important indices and lengths are initialized. Variables minIndex and maxIndex determine the current search range on array1.
  • The function employs a binary search. In each iteration:
    • It calculates indices i and j so that the left parts of the two arrays together hold half of all elements (one extra when the total is odd).
    • It then reads the maximum element on the left and the minimum element on the right of both partitions.
    • Depending on the values of these 'max' and 'min' elements, the function decides if a valid partition has been found:
      • If a valid partition is found, it checks if the combined length of arrays is odd or even to return the appropriate median.
      • If leftMax1 > rightMin2, it lowers maxIndex to i - 1; otherwise it raises minIndex to i + 1.
  • The loop continues until a valid partition is found, which is guaranteed for sorted input.

The solution handles edge cases efficiently by using Number.MIN_SAFE_INTEGER and Number.MAX_SAFE_INTEGER for comparisons when partitions might exceed array boundaries (i.e., i or j being 0 or equal to the length of the array).
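The sentinel idea can be isolated in a few lines. The sketch below uses Python's infinities in place of Number.MIN_SAFE_INTEGER and Number.MAX_SAFE_INTEGER, which is a substitution made for this illustration; boundary_values is a hypothetical helper.

```python
import math


def boundary_values(nums, cut):
    """Return (max_left, min_right) around a cut that leaves `cut` items on the left."""
    max_left = nums[cut - 1] if cut > 0 else -math.inf
    min_right = nums[cut] if cut < len(nums) else math.inf
    return max_left, min_right


print(boundary_values([1, 3], 0))  # (-inf, 1)
print(boundary_values([1, 3], 2))  # (3, inf)
print(boundary_values([], 0))      # (-inf, inf)
```

An empty side can never invalidate a partition, because its sentinel compares favorably against any real element. This is why an empty input array needs no special case.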

This method runs in O(log(min(m, n))) time, where m and n are the array lengths, and uses O(1) extra space, since it never builds the merged array.

python
from typing import List


class Solution:
    def medianOfArrays(self, array1: List[int], array2: List[int]) -> float:
        if len(array1) > len(array2):
            return self.medianOfArrays(array2, array1)

        x, y = len(array1), len(array2)
        low, high = 0, x

        while low <= high:
            partitionX = (low + high) // 2
            partitionY = (x + y + 1) // 2 - partitionX

            maxLeftX = float("-inf") if partitionX == 0 else array1[partitionX - 1]
            minRightX = float("inf") if partitionX == x else array1[partitionX]
            maxLeftY = float("-inf") if partitionY == 0 else array2[partitionY - 1]
            minRightY = float("inf") if partitionY == y else array2[partitionY]

            if maxLeftX <= minRightY and maxLeftY <= minRightX:
                if (x + y) % 2 == 0:
                    return (max(maxLeftX, maxLeftY) + min(minRightX, minRightY)) / 2
                else:
                    return float(max(maxLeftX, maxLeftY))
            elif maxLeftX > minRightY:
                high = partitionX - 1
            else:
                low = partitionX + 1

        # Not reached for sorted input; mirrors the fall-through in the other versions.
        return 0.0

This Python solution efficiently calculates the median of two sorted arrays using binary search, making it O(log(min(n, m))) in time complexity where n and m are the lengths of the two arrays. Here's a breakdown of the method:

  • The function medianOfArrays first ensures that the smaller array is the first parameter for optimization.
  • The variables x and y represent the lengths of array1 and array2. The search ranges low and high are initialized for the binary search on the smaller array.
  • Inside the while loop, the partitions of both arrays are determined.
  • Boundary values for both left and right sides of the partitions (maxLeftX, maxLeftY and minRightX, minRightY) are defined to handle edges and simplify comparisons between elements around the partition.
  • A conditional check tests whether the partition is valid (maxLeftX <= minRightY and maxLeftY <= minRightX). If it is and the combined length of the arrays is even, the median is the average of the largest left value and the smallest right value; if odd, it is the largest left value.
  • If maxLeftX is greater than minRightY, too much of array1 is on the left, so high moves to partitionX - 1.
  • Otherwise too little of array1 is on the left, so low moves to partitionX + 1.

Because the loop visits only O(log(min(n, m))) partition positions and keeps no copy of the data, the cost grows very slowly with input size.
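One way to gain confidence in the partition search is to compare it with the standard library on random inputs. The sketch below restates the search compactly so that it is self-contained. The names median_two_sorted and cross_check, the value range, and the trial count are all choices made for this illustration.

```python
import random
import statistics


def median_two_sorted(a, b):
    """Partition search, restated compactly for this check."""
    if len(a) > len(b):
        a, b = b, a
    m, n = len(a), len(b)
    low, high = 0, m
    while low <= high:
        i = (low + high) // 2
        j = (m + n + 1) // 2 - i
        left_a = a[i - 1] if i > 0 else float("-inf")
        right_a = a[i] if i < m else float("inf")
        left_b = b[j - 1] if j > 0 else float("-inf")
        right_b = b[j] if j < n else float("inf")
        if left_a > right_b:
            high = i - 1
        elif left_b > right_a:
            low = i + 1
        elif (m + n) % 2:
            return float(max(left_a, left_b))
        else:
            return (max(left_a, left_b) + min(right_a, right_b)) / 2
    raise ValueError("inputs must be sorted")


def cross_check(trials=1000, seed=0):
    """Compare against statistics.median on random sorted inputs; return mismatches."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        a = sorted(rng.randint(-50, 50) for _ in range(rng.randint(0, 8)))
        b = sorted(rng.randint(-50, 50) for _ in range(rng.randint(1, 8)))
        if median_two_sorted(a, b) != statistics.median(a + b):
            mismatches += 1
    return mismatches


print(cross_check())  # 0 when every trial agrees
```

Small arrays with a narrow value range are used deliberately, since they produce many duplicates and empty-side partitions, the cases most likely to expose an off-by-one error.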
