Find K Pairs with Smallest Sums

Updated on 26 May, 2025
Find K Pairs with Smallest Sums header image

Problem Statement

In this problem, we are provided with two integer arrays, nums1 and nums2, both sorted in non-decreasing order. The task is to find k pairs (u,v) where u belongs to nums1 and v belongs to nums2, such that the sum of the numbers in each pair is as small as possible in comparison to all other possible pairs formed by taking one element from each array. The objective is to return these k pairs with the smallest possible sums.

Examples

Example 1

Input:

nums1 = [1,7,11], nums2 = [2,4,6], k = 3

Output:

[[1,2],[1,4],[1,6]]

Explanation:

The first 3 pairs are returned from the sequence: [1,2],[1,4],[1,6],[7,2],[7,4],[11,2],[7,6],[11,4],[11,6]

Example 2

Input:

nums1 = [1,1,2], nums2 = [1,2,3], k = 2

Output:

[[1,1],[1,1]]

Explanation:

The first 2 pairs are returned from the sequence: [1,1],[1,1],[1,2],[2,1],[1,2],[2,2],[1,3],[1,3],[2,3]

Constraints

  • 1 <= nums1.length, nums2.length <= 105
  • -109 <= nums1[i], nums2[i] <= 109
  • nums1 and nums2 both are sorted in non-decreasing order.
  • 1 <= k <= 104
  • k <= nums1.length * nums2.length

Approach and Intuition

Given the sorted nature of nums1 and nums2, the smallest pairs are likely to originate from the smallest elements in both arrays. The challenge lies in efficiently finding these pairs without checking each combination, as this would be computationally expensive given the potential size of the inputs.

Example-driven Understanding:

From the provided examples:

Example 1

  • Input: nums1 = [1,7,11], nums2 = [2,4,6], k = 3
  • Output: [[1,2],[1,4],[1,6]]

Observation:

  • The smallest number in nums1 is 1, and the smallest numbers in nums2 are 2, 4, and 6.
  • By forming pairs starting from the smallest elements of nums1, we get 1+2, 1+4, and 1+6.

Explanation:

  • Since we need only three pairs, and given that the array sizes are smaller than k, we easily concatenate the smallest possible combinations from the start of both arrays.

Example 2

  • Input: nums1 = [1,1,2], nums2 = [1,2,3], k = 2
  • Output: [[1,1],[1,1]]

Observation:

  • The first elements of nums1 are twins both 1, allowing repeated minimal sums by pairing each with the smallest element of nums2, which is also 1.

Explanation:

  • We form the smallest sum, 1+1, twice, given the repetition in nums1 and the size of k.

Steps to Formulate a Solution:

  1. Use a min-heap to keep track of the next smallest sum and from which indices it comes.
  2. Initially, populate the heap with pairs consisting of elements from each nums1 and the first element of nums2, since this will always include the smallest pair.
  3. Track indices of elements in nums1 and nums2 that form these pairs to avoid recomputation and to know where the next element after the current smallest should come from.
  4. Extract the smallest sum from the heap, and add the next possible pair to the heap involving the next element from nums2 unless it's exhausted. This helps in limiting the growth of the heap and focuses only on potential candidates for smallest pairs.
  5. Continue this until we have extracted k smallest sums/pairs.

Considerations:

  • Take advantage of the problem's inherent ordered structure to optimize the search rather than relying on brute force.
  • Use data structures like min-heaps to efficiently manage and retrieve the next smallest pairs.
  • Ensure to handle edge cases such as equal elements within arrays or larger values of k relative to input sizes.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
public:
    vector<vector<int>> findKSmallestPairs(vector<int>& nums1, vector<int>& nums2, int k) {
        int size1 = nums1.size();
        int size2 = nums2.size();

        vector<vector<int>> result;
        set<pair<int, int>> seen;

        priority_queue<pair<int, pair<int, int>>, vector<pair<int, pair<int, int>>>,
                       greater<pair<int, pair<int, int>>>> pq;
        pq.push({nums1[0] + nums2[0], {0, 0}});
        seen.insert({0, 0});

        while (k-- && !pq.empty()) {
            auto curr = pq.top();
            pq.pop();
            int i = curr.second.first;
            int j = curr.second.second;

            result.push_back({nums1[i], nums2[j]});

            if (i + 1 < size1 && seen.find({i + 1, j}) == seen.end()) {
                pq.push({nums1[i + 1] + nums2[j], {i + 1, j}});
                seen.insert({i + 1, j});
            }

            if (j + 1 < size2 && seen.find({i, j + 1}) == seen.end()) {
                pq.push({nums1[i] + nums2[j + 1], {i, j + 1}});
                seen.insert({i, j + 1});
            }
        }

        return result;
    }
};

The given C++ code defines a method to find k pairs with the smallest sums from two integer arrays, nums1 and nums2. Follow this analysis to understand the implementation strategy:

  • Two integer vectors, nums1 and nums2, are the sources of the numbers. Given an integer k, the task is to identify k pairs of numbers (one from each vector) such that their combined sum is among the smallest possible.
  • The solution utilizes a priority queue to manage the minimum sum considerations efficiently. This data structure allows the algorithm to always process the next smallest possible sum.
  • The priority queue stores elements of form pair<int, pair<int, int>>, where the first int is the sum of the pair of numbers indexed by the nested pair<int, int>, which keeps track of the indices in nums1 and nums2.
  • Initialization of the priority queue and a set seen occurs by adding the pair resulting from the first elements of nums1 and nums2. The set acts as a bookkeeping tool to avoid processing the same pair multiple times.
  • Each iteration of the while loop processes the smallest current sum, extracting it from the priority queue, and the code ensures that only unseen pairs of subsequent elements in nums1 or nums2 are considered next.
  • Potential new pairs formed by either advancing in nums1 or in nums2 are added to the queue as long as they haven't been seen before.
  • The process halts when k pairs have been found or the priority queue is exhausted.
  • The result vector<vector<int>> accumulates the k pairs with the smallest sums.

The result, which stores the pairs of numbers, is returned as a list of lists of integers, where each list contains two integers. This solution is effective in managing the consideration of multiple pair combinations through a min-heap (priority queue) and ensuring no redundant pairs are processed with the aid of a tracking set.

java
class Solution {
    public List<List<Integer>> findKSmallestPairs(int[] firstArray, int[] secondArray, int k) {
        int len1 = firstArray.length;
        int len2 = secondArray.length;

        List<List<Integer>> results = new ArrayList<>();
        Set<Pair<Integer, Integer>> tracked = new HashSet<>();

        PriorityQueue<int[]> heap = new PriorityQueue<>((a, b)->(a[0] - b[0]));
        heap.offer(new int[]{firstArray[0] + secondArray[0], 0, 0});
        tracked.add(new Pair<Integer, Integer>(0, 0));

        while (k-- > 0 && !heap.isEmpty()) {
            int[] current = heap.poll();
            int i = current[1];
            int j = current[2];

            results.add(List.of(firstArray[i], secondArray[j]));

            if (i + 1 < len1 && !tracked.contains(new Pair<Integer, Integer>(i + 1, j))) {
                heap.offer(new int[]{firstArray[i + 1] + secondArray[j], i + 1, j});
                tracked.add(new Pair<Integer, Integer>(i + 1, j));
            }

            if (j + 1 < len2 && !tracked.contains(new Pair<Integer, Integer>(i, j + 1))) {
                heap.offer(new int[]{firstArray[i] + secondArray[j + 1], i, j + 1});
                tracked.add(new Pair<Integer, Integer>(i, j + 1));
            }
        }

        return results;
    }
}

The given Java program provides a solution to find K pairs from two arrays such that their sums are among the smallest. The method accomplishes this by utilizing a min-heap structure and a hash set to efficiently track and retrieve the smallest sums of pairs efficiently.

Here’s a breakdown of the solution’s primary components and approach:

  • Initialization:

    • Integers len1 and len2 hold the lengths of firstArray and secondArray respectively.
    • A List named results is created to store the resultant pairs.
    • A Set named tracked is utilized to keep track of indices already considered to avoid redundant processing.
    • A PriorityQueue (min-heap) heap is initialized to store arrays where the first element is the sum of a pair from both arrays and the next two elements are their respective indices in firstArray and secondArray.
  • Heap Operation:

    • The heap is initially populated with the smallest sum pair (from the first element of both arrays).
    • A loop runs up to k times or until the heap is empty. During each iteration:
      • The smallest sum pair is extracted from the heap.
      • The indices i and j are retrieved from the extracted array.
      • The respective pair values from firstArray and secondArray are added to the results.
      • The neighbor pairs (i+1, j) and (i, j+1) (if within bounds and not previously considered) are pushed into the heap along with their sum and updated indices.
  • Output:

    • After processing up to K smallest pairs or exhausting available pairs, the results list containing K smallest pairs is returned.

This structure ensures the program efficiently handles the computation and memory management by using the priority queue for direct access to the smallest elements and a set to prevent re-processing of the same pairs, optimizing both time and space complexity. Be sure to implement and handle edge cases such as when K exceeds the total number of possible pairs.

python
class Solution:
    def findKSmallestPairs(self, list1: List[int], list2: List[int], k: int) -> List[List[int]]:
        from heapq import heappush, heappop
        length1 = len(list1)
        length2 = len(list2)

        results = []
        visited_pairs = set()

        priority_queue = [(list1[0] + list2[0], (0, 0))]
        visited_pairs.add((0, 0))

        while k > 0 and priority_queue:
            value, (index1, index2) = heappop(priority_queue)
            results.append([list1[index1], list2[index2]])

            if index1 + 1 < length1 and (index1 + 1, index2) not in visited_pairs:
                heappush(priority_queue, (list1[index1 + 1] + list2[index2], (index1 + 1, index2)))
                visited_pairs.add((index1 + 1, index2))

            if index2 + 1 < length2 and (index1, index2 + 1) not in visited_pairs:
                heappush(priority_queue, (list1[index1] + list2[index2 + 1], (index1, index2 + 1)))
                visited_pairs.add((index1, index2 + 1))
            k = k - 1
        
        return results

The solution implements a function in Python to find the k pairs with the smallest sums from two given input lists. Here is a summary of how this solution works:

  • The function imports the necessary heappush and heappop functions from Python's heapq module to manage a priority queue.

  • The priority queue is initialized with the sum of the first elements of each list, along with their indices. A set named visited_pairs ensures that the same pair is not processed more than once.

  • The main part of the function operates within a while loop, processing elements until k pairs are identified or the priority queue is empty. In each iteration:

    • The smallest sum is popped from the heap, and the corresponding pair of elements from list1 and list2 is added to the result list.
    • It checks for possible new pairs by incrementing the index of the first element without moving to the next element of the second list, and vice versa. If a new valid pair is found, it is added to the heap and recorded in the visited_pairs set.
    • The counter k is decremented each time a pair is added to the result list.
  • The function returns the list of k pairs with the smallest sums.

This implementation is efficient due to the use of a min-heap, which allows for the selection of minimal sum pairs in logarithmic time per insertion or deletion, and ensures optimality and speed especially when k is smaller relative to the size of the input lists.

Comments

No comments yet.