Find All Duplicates in an Array

Updated on 27 May, 2025
Find All Duplicates in an Array header image

Problem Statement

In this problem, you are provided with an array nums consisting of integers that range from 1 to n, where each integer can appear at most twice. The task is to identify all the integers that appear exactly twice in the array and return them in the form of an array.

The challenge stipulates two critical requirements for the solution:

  1. The time complexity of the solution should be O(n), meaning that the algorithm must be able to process the array in a time proportional to the length of the array.
  2. The space complexity for auxiliary (additional) space should be constant, i.e., it should not depend on the size of the input array, excluding the output space. This rules out solutions that require space proportional to n for storing intermediate data.

Examples

Example 1

Input:

nums = [4,3,2,7,8,2,3,1]

Output:

[2,3]

Example 2

Input:

nums = [1,1,2]

Output:

[1]

Example 3

Input:

nums = [1]

Output:

[]

Constraints

  • n == nums.length
  • 1 <= n <= 105
  • 1 <= nums[i] <= n
  • Each element in nums appears once or twice.

Approach and Intuition

Given the constraints, a naive approach that involves using a hash map to count occurrences of each number won't suffice due to the auxiliary space constraint. We must therefore consider methods that use the existing array nums for tracking repetitive elements.

The optimal approach is based on marking visited numbers in the array itself. The strategy works as follows:

  1. Traverse each element in the array.
  2. For each element, use its value as an index to access other positions in the array.
  3. When a position is accessed for the first time, negate the value at that position in the array to mark it as visited.
  4. If you encounter a position that has a negative number, this means the index (representing a number in the array) has been visited before, implying the number is a duplicate.

Through this mechanism, you can identify numbers that appear twice without needing extra space for tracking. Since each element is visited and marked at most twice, this method adheres to O(n) time complexity. Additionally, since no extra space is used aside from negating elements in the original array, the auxiliary space is constant.

Examples Breakdown:

  • In the first example, as you navigate the array [4,3,2,7,8,2,3,1], the values at the indices corresponding to 2 and 3 would be marked negative on the first visit, and identified as duplicates on the second encounter.
  • The second example demonstrates that the initial element itself can be a duplicate and is correctly identified.
  • In the third example, only one element exists, and it appears only once, which correctly results in an empty output.

This approach does not only meet the time and space complexity requirements but also cleverly uses the properties of the array and the value range to solve the problem efficiently.

Solutions

  • C++
  • Java
cpp
class Solution {
 public:
    vector<int> getDuplicates(vector<int>& elements) {
        vector<int> duplicates;

        for (int element : elements) {
            int index = abs(element) - 1;
            if (elements[index] < 0) {
                duplicates.push_back(abs(element));
            }
            elements[index] = -elements[index];
        }

        return duplicates;
    }
};

The solution provided in C++ efficiently identifies all duplicate elements within an array. Here's a brief overview of how this approach works:

  • The function getDuplicates accepts a reference to a vector of integers, which contains the elements of the array you are analyzing for duplicates.
  • A new vector duplicates is declared to store the results, i.e., the duplicate elements found in the input array.
  • Iterating through each element of the input array exploits a key observation: the array elements are used as indices to mark visited locations in the same array. This is accomplished by negating the value at the indexed position. If the value is already negative, this indicates a revisit, and hence, the element is a duplicate.
  • For each element in the array, calculate its corresponding index by subtracting one from its absolute value (to account for 0-based indexing).
  • Check if the value at the computed index is negative:
    • If yes, the abs value of the current element is added to the duplicates vector since this implies the index has been visited before, marking the element as a duplicate.
    • If no, negate the value at the index to mark it as visited.
  • Return the duplicates vector, which now contains all duplicate elements from the input array.

This method proves to be efficient because it operates in linear time, O(n), where n is the number of elements in the array, and requires no additional space for its operation, O(1), beyond the space needed for the result. This efficiency makes the solution well-suited for large datasets. Make sure to use this function when the input guarantees that all elements are within the range from 1 to n, where n is the size of the array.

java
class Solution {
    public List<Integer> identifyDuplicates(int[] array) {
        List<Integer> duplicates = new ArrayList<>();

        for (int value : array) {
            int index = Math.abs(value) - 1;
            if (array[index] < 0) { // already marked
                duplicates.add(Math.abs(value));
            }
            array[index] = -array[index];
        }

        return duplicates;
    }
}

This Java solution addresses the problem of finding all duplicates in an array using a space-efficient algorithm. The solution leverages the input array itself to detect duplicates by marking visited numbers. Here's how you execute this approach:

  1. Initialize an empty list to store the duplicates found in the array.
  2. Iterate through each number in the provided array.
  3. Calculate the index associated with the current number's absolute value (considering that indices are zero-based).
  4. Check if the number at the computed index is negative, which indicates the original number has been seen before, hence is a duplicate.
  5. If a duplicate is detected, add the absolute value of the current number to the list of duplicates.
  6. Mark the number at the computed index as visited by making it negative.

By the end of the loop, the list contains all the numbers that appear twice in the array. This method does not use additional space for another data structure, as it directly modifies the original array for marking purposes, exploiting the properties of indices and values for tracking seen numbers. Please note that the array elements should strictly be positive numbers for this method to work correctly. The solution ensures modifications are only sign-based, preserving the ability to extract the original values using absolute operations.

Comments

No comments yet.