Top K Frequent Elements

Updated on 30 June, 2025
Top K Frequent Elements header image

Problem Statement

In the given problem, we are asked to find the k most frequent elements from an integer array nums. The elements in the output can be in any order, but they must be the top k elements in terms of frequency within the array. This problem examines our ability to track and sort elements based on the frequency of occurrence.

Examples

Example 1

Input:

nums = [1,1,1,2,2,3], k = 2

Output:

[1,2]

Example 2

Input:

nums = [1], k = 1

Output:

[1]

Constraints

  • 1 <= nums.length <= 105
  • -104 <= nums[i] <= 104
  • k is in the range [1, the number of unique elements in the array].
  • It is guaranteed that the answer is unique.

Approach and Intuition

To solve the problem of finding the k most frequent elements in the array, you can follow these general steps:

  1. Count Frequency of Each Element:
    Use a hash map to store the frequency of each element in the array. The key in the hash map would be the element from the array, and the value would be the count of how often it appears.

  2. Sort Elements by Frequency:
    Convert the frequency hash map into a list of (element, frequency) pairs and sort this list by frequency in descending order.

  3. Extract Top k Elements:
    Once sorted, the first k elements of this list will represent the k most frequent elements. Extract these and return them.

  • Special Scenario Handling:
    • For an array with only one element or when k equals 1, the logic simplifies as the output will be the element itself, irrespective of its count.

These steps are optimal given the constraints, especially when considering the possible size of the input array and the range of elements it can include. This problem ensures that a unique answer is always possible, simplifying some aspects of the implementation.

Solutions

  • C++
  • Java
  • Python
cpp
class Solution {
private:
    vector<int> distinctElements;
    unordered_map<int, int> frequencyMap;
    
public:
    int rearrange(int start, int end, int pivot_idx) {
        int pivot_freq = frequencyMap[distinctElements[pivot_idx]];
            
        // Move pivot to end
        swap(distinctElements[pivot_idx], distinctElements[end]);
    
        // Organize less frequent to the left
        int store_idx = start;
        for (int i = start; i <= end; i++) {
            if (frequencyMap[distinctElements[i]] < pivot_freq) {
                swap(distinctElements[store_idx], distinctElements[i]);
                store_idx++;
            }
        }
    
        // Restore pivot to its proper position
        swap(distinctElements[end], distinctElements[store_idx]);
    
        return store_idx;
    }
    
    void selectK(int start, int end, int k_least) {
        if (start == end) return;
    
        // Random pile index for partitioning
        int pivot_idx = start + rand() % (end - start + 1);
            
        // Find the pivot's sorted index
        pivot_idx = rearrange(start, end, pivot_idx);
    
        // If pivot is already in the correct spot
        if (k_least == pivot_idx) {
            return;
        } else if (k_least < pivot_idx) {
            selectK(start, pivot_idx - 1, k_least);
        } else {
            selectK(pivot_idx + 1, end, k_least);
        }
    }
    
    vector<int> topKFrequent(vector<int>& nums, int k) {
        // Counting the frequency of each number
        for (int value : nums) {
            frequencyMap[value]++;
        }
    
        // Collect distinct numbers
        int count = frequencyMap.size();
        for (auto& pair : frequencyMap) {
            distinctElements.push_back(pair.first);
        }
    
        // Partial sort to find the kth most frequent element
        selectK(0, count - 1, count - k);
            
        // Retrieve the top k frequent elements
        vector<int> result(k);
        copy(distinctElements.begin() + count - k, distinctElements.end(), result.begin());
        return result;
    }
};

The provided C++ code solves the problem of finding the top K frequent elements in an array. It does this using a combination of a hash map and a quickselect algorithm, a variation inspired by quicksort. Here's a comprehensive overview of the solution:

  • Data Structures:

    • vector<int> distinctElements - holds unique elements from the input array.
    • unordered_map<int, int> frequencyMap - maps each element to its frequency in the input array.
  • Initial Setup:

    • Populate frequencyMap where each key corresponds to an integer from the input and its value is the frequency of that integer.
    • Extract the keys (distinct elements) from frequencyMap into distinctElements.
  • Quickselect Algorithm:

    • The rearrange method is a helper function that partitions elements around a pivot element based on their frequency. Elements less frequent than the pivot come before it, and more frequent elements come after.
    • The selectK method utilizes the rearrange method to find the correct position of the k-th most frequent element (count - k where count is the number of distinct elements). It recursively partitions the array to position the k-th largest frequency at its correct index in sorted order.
  • Finding the Top K Elements:

    • After selectK positions the k-th most frequent element correctly, the top K elements can be found at the end of the distinctElements array.
    • A subvector is then created from the last K positions of distinctElements, which are the most frequent elements.

This implementation efficiently groups the K most frequent elements without fully sorting the entire unique elements list, leveraging randomness for pivot selection which ensures good average performance. The final solution uses O(N) space due to the additional data structures and has an average time complexity of O(N) due to the quickselect process, though the worst case can degrade to O(N^2).

java
class Solution {
    int[] distinctElements;
    Map<Integer, Integer> frequencyMap;
    
    public void exchange(int idx1, int idx2) {
        int temp = distinctElements[idx1];
        distinctElements[idx1] = distinctElements[idx2];
        distinctElements[idx2] = temp;
    }
    
    public int rearrange(int low, int high, int pivotIdx) {
        int pivotFreq = frequencyMap.get(distinctElements[pivotIdx]);
        // Move pivot to the end
        exchange(pivotIdx, high);
        int storeIdx = low;
    
        // Place less frequent elements on the left
        for (int i = low; i <= high; i++) {
            if (frequencyMap.get(distinctElements[i]) < pivotFreq) {
                exchange(storeIdx, i);
                storeIdx++;
            }
        }
    
        // Move pivot to its correct position
        exchange(storeIdx, high);
    
        return storeIdx;
    }
        
    public void quickSelect(int left, int right, int k) {
        // Single element scenario
        if (left == right) return;
            
        // Choose a random pivot index
        Random rand = new Random();
        int pivotIdx = left + rand.nextInt(right - left);
    
        // Pivot placement
        pivotIdx = rearrange(left, right, pivotIdx);
    
        // Check pivot position
        if (k == pivotIdx) {
            return;    
        } else if (k < pivotIdx) {
            // Process left part
            quickSelect(left, pivotIdx - 1, k);     
        } else {
            // Process right part
            quickSelect(pivotIdx + 1, right, k);  
        }
    }
        
    public int[] topKFrequent(int[] nums, int k) {
        frequencyMap = new HashMap<>();
        for (int num : nums) {
            frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
        }
            
        // Extract unique elements
        int size = frequencyMap.size();
        distinctElements = new int[size]; 
        int idx = 0;
        for (int num : frequencyMap.keySet()) {
            distinctElements[idx++] = num;
        }
            
        // Select the (size - k)th element
        quickSelect(0, size - 1, size - k);
        // Retrieve most frequent elements
        return Arrays.copyOfRange(distinctElements, size - k, size);
    }
}

The Java code provided outlines an efficient solution for finding the top k frequent elements in an array using a combination of a frequency map, quickselect algorithm, and array manipulation.

  • Start by creating a frequency map (frequencyMap) to count occurrences of each element in the provided nums array.
  • Convert the frequency map keys to a distinct elements array (distinctElements) which holds unique elements from the input array.
  • Employ the quickselect algorithm to efficiently find the k most frequent elements. The quickselect routine helps in partially sorting the elements to ensure that the k most frequent elements end up at the end of the distinctElements array:
    • In quickselect, pick a pivot and reorganize the array such that elements less frequent than the pivot come before it, and more frequent elements after.
    • Depending upon the pivot's position and k, either recursively call quickselect on the right part (to include more frequent elements) or the left.
  • After determining the correct position through quickselect, the top k elements are found at the end of the distinctElements array. Use Arrays.copyOfRange() to extract and return these elements.

This method leverages the speed of the quickselect algorithm, optimized for finding the k highest elements without fully sorting the data, making it suitable for large datasets where efficiency is crucial.

python
from collections import Counter
import random
    
class FrequencySolution:
    def getTopKFrequent(self, elements: List[int], top_k: int) -> List[int]:
        frequency = Counter(elements)
        distinct_elements = list(frequency.keys())
            
        def partitionFunction(start, end, pivot_idx) -> int:
            pivot_freq = frequency[distinct_elements[pivot_idx]]
            distinct_elements[pivot_idx], distinct_elements[end] = distinct_elements[end], distinct_elements[pivot_idx]
                
            swap_idx = start
            for i in range(start, end):
                if frequency[distinct_elements[i]] < pivot_freq:
                    distinct_elements[swap_idx], distinct_elements[i] = distinct_elements[i], distinct_elements[swap_idx]
                    swap_idx += 1
    
            distinct_elements[end], distinct_elements[swap_idx] = distinct_elements[swap_idx], distinct_elements[end]
            return swap_idx
            
        def quickSelect(start, end, k_least) -> None:
            if start == end:
                return
            random_pivot_idx = random.randint(start, end)
            pivoted_idx = partitionFunction(start, end, random_pivot_idx)
    
            if k_least == pivoted_idx:
                return 
            elif k_least < pivoted_idx:
                quickSelect(start, pivoted_idx - 1, k_least)
            else:
                quickSelect(pivoted_idx + 1, end, k_least)
            
        num_elements = len(distinct_elements)
        quickSelect(0, num_elements - 1, num_elements - top_k)
        return distinct_elements[num_elements - top_k:]

The given Python code solves the problem of finding the top K frequent elements in a list using a combination of a counter to capture element frequencies and a quickselect algorithm to efficiently select the top K elements based on their frequencies.

  • Start by importing necessary modules:

    • Counter from collections to count the frequency of each element in the list.
    • random to choose random pivot points for the quickselect algorithm.
  • Define a class FrequencySolution, which includes a method getTopKFrequent with parameters elements (the list of integers) and top_k (the number of top frequent elements to retrieve).

  • Inside getTopKFrequent, utilize the Counter to create a frequency dictionary of the elements. Convert the keyset of this dictionary to a list distinct_elements, which contains the elements without their duplicates.

  • Define a nested partitionFunction to assist the quickselect process:

    • This function arranges elements around a pivot chosen based on frequency so that elements less or equal in frequency than the pivot are on the left, and those greater are on the right, then returning the partition index.
  • Implement quickSelect recursively:

    • Choose a random pivot index and call partitionFunction to portion the elements.
    • Depending on the pivot's position and the desired rank k_least, recursion either investigates the left or the right sublist to find the k-th positional element.
  • The main quickSelect action in getTopKFrequent is initialized to segregate the array such that the top K frequent elements are positioned on the rightmost side. These top elements are then sliced from the list and returned.

By harnessing a Quickselect algorithm, the solution performs efficiently, particularly favorable in scenarios where top_k is much smaller than the number of distinct elements in the array. Quickselect generally provides better performance for this problem type compared to sorting the entire frequency list, particularly when only a subset of the data (top K elements) is needed.

Comments

No comments yet.