Subarrays with K Different Integers

Updated on 09 July, 2025
Subarrays with K Different Integers header image

Problem Statement

The task is to find the number of "good subarrays" within an integer array nums, given an integer k. A subarray is regarded as "good" when it contains exactly k distinct integers. It is important to understand that a subarray must be a contiguous segment of the parent array nums. The distinction between any two subarrays will depend on their content diversity relative to k, making the approach to this problem non-trivial especially as k and the size of nums increase.

Examples

Example 1

Input:

nums = [1,2,1,2,3], k = 2

Output:

7

Explanation:

Subarrays formed with exactly 2 different integers: [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]

Example 2

Input:

nums = [1,2,1,3,4], k = 3

Output:

3

Explanation:

Subarrays formed with exactly 3 different integers: [1,2,1,3], [2,1,3], [1,3,4].

Constraints

  • 1 <= nums.length <= 2 * 104
  • 1 <= nums[i], k <= nums.length

Approach and Intuition

Given the problem's requirement to count subarrays with exactly k distinct integers, one can anticipate several foundational steps and intuitions:

  1. Direct Counting is Infeasible: With the array length potentially reaching up to 20,000, exhaustive enumeration and verification of all possible subarrays will result in a super-polynomial time complexity, which is computationally infeasible.

  2. Sliding Window Technique: A viable approach to solve this problem optimally is to use a sliding window method combined with a hash map. This technique helps dynamically track the count of distinct integers as the window expands or contracts.

  3. Handling Distinct Counts: Every extension or contraction of the window can be handled efficiently by updating the hash map, which tracks the count of each number inside the current window:

    • When a new number is added to the window (i.e., expanding), increase its count in the map.
    • When a number is removed (i.e., window is contracting), reduce its count, potentially removing it from the map if its count drops to zero.
  4. Double Counting via Sliding Window Adjustment: The challenge intensifies with needing to ensure that each subarray has exactly k distinct numbers. One approach is to calculate subarrays with at most k distinct integers and subtract the count of subarrays with at most k-1 distinct integers from it.

  5. Efficiency Concerns: Since every operation inside the window (adding or removing an element) can be achieved in constant time due to the hash map, the sliding window technique remains efficient. The overall time complexity, in the ideal case, is linear relative to the size of nums multiplied by the operations performed per element (updating the map).

By applying these strategies, one can accurately and efficiently determine the number of good subarrays for given constraints, optimizing both performance and correctness.

Solutions

  • C++
cpp
class Solution {
   public:
    int countKDistinctSubarrays(vector<int>& arr, int k) {
        vector<int> count(arr.size() + 1, 0);
        int totalSubarrays = 0, start = 0, end = 0, windowSize = 0;
    
        while (end < arr.size()) {
            if (++count[arr[end++]] == 1) {
                k--;
            }
    
            if (k < 0) {
                --count[arr[start++]];
                k++;
                windowSize = 0;
            }
    
            if (k == 0) {
                while (count[arr[start]] > 1) {
                    --count[arr[start++]];
                    windowSize++;
                }
                totalSubarrays += (windowSize + 1);
            }
        }
        return totalSubarrays;
    }
};

In the provided C++ code, the solution focuses on finding the number of subarrays within a given array that contain exactly k distinct integers. It uses a sliding window approach with two pointers, start and end, to efficiently determine the count of such subarrays without the need to check every possible subarray explicitly. Here's a breakdown of how the code tackles the problem:

  • A function countKDistinctSubarrays is declared, which takes a vector of integers arr and an integer k as parameters, representing the array and the number of distinct integers in the subarrays, respectively.
  • Initially, a count vector is created to keep track of the frequency of each element in the current window. The totalSubarrays variable is initialized to zero to store the result.
  • The outer while loop progresses the end pointer from the beginning to the end of the array, incrementing the count of each element as it is included in the window. If a new distinct element is added (count[arr[end]] == 1), k is decremented.
  • When k becomes negative (meaning the window has more than k distinct integers), the window is adjusted by moving the start pointer forward and reducing the counts accordingly until k is non-negative.
  • An inner while loop further adjusts the start pointer to ensure that every possible subarray starting from the current position of start and ending at end - 1 is counted. The count of such valid subarrays is then incremented based on the size of the window freeze frame, which ensures subarrays containing exactly k distinct integers are counted.
  • Finally, the function returns totalSubarrays, which is the count of subarrays containing exactly k distinct integers.

The beauty of this approach lies in its efficiency—by maintaining a dynamic sliding window of the array, the solution avoids the computational overhead of checking each subarray individually, which would be substantially slower especially for large arrays. The sliding window technique, combined with careful counting of distinct elements and adjusting the window size, offers an optimal solution to the problem.

  • Java
java
class Solution {
    public int countSubarraysWithKDistinct(int[] arr, int k) {
        int[] countMap = new int[arr.length + 1];
        int total = 0;
        int start = 0;
        int end = 0;
        int count = 0;
    
        while (end < arr.length) {
            if (countMap[arr[end++]]++ == 0) {
                k--;
            }
    
            if (k < 0) {
                --countMap[arr[start++]];
                k++;
                count = 0;
            }
    
            if (k == 0) {
                while (countMap[arr[start]] > 1) {
                    --countMap[arr[start++]];
                    count++;
                }
                total += (count + 1);
            }
        }
        return total;
    }
}

This solution aims to tackle the problem of counting subarrays in an array that contain exactly K different integers. It uses an array-based sliding window approach to efficiently keep track of the number of distinct integers and calculates the subarray counts.

  • The function countSubarraysWithKDistinct takes two parameters: an integer array arr and an integer k representing the desired number of distinct integers.
  • A countMap array is utilized to store the frequency of each integer in the current window of the array.
  • Variables total, start, end, and count are initialized to manage the counting and the window's bounds during the traversal of the array.
  • As the end point of the sliding window expands (end++), the array element at this position increments in the countMap and checks if a new integer has been added (countMap[arr[end]]++ == 0). If yes, decrement k.
  • If k becomes less than zero, meaning the window has more than k distinct integers, the algorithm shrinks the window from the start side until k is no longer negative. This is done by incrementing the start pointer and adjusting the map and k accordingly.
  • When k is exactly zero, indicating the window has exactly k distinct integers, count possible subarrays originating from this window configuration. It increments count for all duplicate start elements and adds these to total.
  • The algorithm keeps adjusting the window using the start and end pointers and updates the total count of valid subarrays.
  • Finally, total is returned, providing the count of all subarrays in arr that contain exactly k different integers.

This Java implementation efficiently manages the exploration of subarrays using both increasing and decreasing the bounds of a sliding window based on the status of k and updates the count of distinct integers efficiently using an array to represent current integer frequencies within the window.

  • Python
python
class Solution:
    def countDistinctSubarrays(self, arr: List[int], distinct_num: int) -> int:
        count_dict = [0] * (len(arr) + 1)
    
        total_subarrays = 0
        window_start = 0
        window_end = 0
        active_count = 0
    
        while window_end < len(arr):
            count_dict[arr[window_end]] += 1
    
            if count_dict[arr[window_end]] == 1:
                distinct_num -= 1
    
            if distinct_num < 0:
                count_dict[arr[window_start]] -= 1
                if count_dict[arr[window_start]] == 0:
                    distinct_num += 1
                window_start += 1
                active_count = 0
    
            if distinct_num == 0:
                while count_dict[arr[window_start]] > 1:
                    count_dict[arr[window_start]] -= 1
                    window_start += 1
                    active_count += 1
                total_subarrays += (active_count + 1)
    
            window_end += 1
    
        return total_subarrays

This Python3 solution defines a method countDistinctSubarrays in the Solution class to solve the problem of finding subarrays containing exactly K distinct integers. Here's how the implementation works:

  • Initialize a list count_dict of size len(arr) + 1 to zero, to keep track of the count of each element within the current window of the subarray.
  • Set total_subarrays, window_start, window_end, and active_count to 0.
  • Traverse the array using the window_end variable. For each element at window_end, increment its count in count_dict.
  • Decrement distinct_num when a new distinct element is encountered by checking if its count becomes one.
  • If distinct_num becomes negative, adjust the window_start to potentially remove some elements from the current subarray to try and bring back the count of distinct integers within the required limit.
  • Whenever distinct_num reaches zero, meaning the subarray between window_start and window_end has exactly K distinct integers, further adjust window_start and count the number of valid subarrays.
  • Continue this until window_end has traversed the entire array.
  • Return the total_subarrays count, which represents the number of subarrays having exactly K distinct integers.

This approach utilises the sliding window technique combined with a frequency count array, effectively allowing the program to consider each possible subarray and record the valid ones as per the given conditions. Adjustments of the window_start help ensure that once a valid subarray configuration is disturbed (due to an exceeding number of distinct integers), it is realigned to remain within the problem constraints.

Efficient management of the subarray window and an accurate count of valid configurations ensure that the solution is both effective and optimal for large inputs.

Comments

No comments yet.