287. Find the Duplicate Number - Solutions and Explanation

Problem Statement

The task is to identify a repeated number in a given array nums, which consists of n + 1 integers. Each of these integers lies within the inclusive range from 1 to n. The unique feature of this array is that while all integers appear exactly once, there exists a single integer that repeats two or more times. This recurring integer is the output that needs to be determined.

The key challenge is to find this repeating number under the constraints of avoiding any modifications to the array nums and utilizing only constant extra space during the solution process. This specific requirement prohibits the use of additional data structures that scale with the size of the input, making traditional frequency counting techniques infeasible.

Examples

Example 1

Input:

nums = [1,3,4,2,2]

Output:

Example 2

Input:

nums = [3,1,3,4,2]

Output:

Example 3

Input:

nums = [3,3,3,3,3]

Output:

Constraints

1 <= n <= 105
nums.length == n + 1
1 <= nums[i] <= n
All the integers in nums appear only once except for precisely one integer which appears two or more times.

Approach and Intuition

The constraints and requirements suggest a need for an algorithm that is efficient not just in terms of time but also in how it uses memory. Given the nature of the problem, several insights and approaches are possible:

Using Floyd's Tortoise and Hare (Cycle Detection): This approach is inspired by cycle detection in linked lists. Since one number repeats, we can think of the array like a linked list where each position points to the position at the index of its value. This setup inherently forms a cycle due to the repeated number:
- Start with two pointers, the slow pointer progresses one step at a time (slow = nums[slow]), and the fast pointer which advances two steps at a time (fast = nums[nums[fast]]).
- The first meeting point of these two pointers will be inside the cycle. The start of the cycle, which corresponds to the entry of the repeated number, can be found by initializing another pointer at the beginning and moving both this pointer and the slow pointer one step at a time. Their meeting point will be at the start of the cycle — the repeated number.

The Floyd's Tortoise and Hare method minimizes space usage while ensuring a polynomial time complexity. It effectively leverages the mathematical properties of cycles to solve the problem under stringent space constraints. By relying on the inherent structure formed by the indices and their values, this method cleverly sidesteps the need for additional data structures.

Solutions

C++
Java
Python

                            cpp
                            
                        
class Solution {
public:
    int detectDuplicate(vector<int>& arr) {
        int slow = arr[0];
        int fast = arr[0];
            
        do {
            slow = arr[slow];
            fast = arr[arr[fast]];
        } while (slow != fast);
            
        slow = arr[0];
        while (slow != fast) {
            slow = arr[slow];
            fast = arr[fast];
        }
            
        return fast;
    }
};

The solution uses the Floyd's Tortoise and Hare algorithm to find the duplicate number in an array. The process involves using two pointers, slow and fast, both initially set to the start of the array. The algorithm works as follows:

Initiate the slow pointer to the first element and the fast pointer also to the first element.
Advance the slow pointer by one step and the fast pointer by two steps until they meet. This loop continues until the two pointers point to the same element, indicating a cycle in the array.
Reset the slow pointer back to the start of the array.
Move both the slow and fast pointers one step at a time until they meet again.

The position at which they meet the second time is the duplicate number in the array. This method is efficient and requires O(n) time complexity with O(1) space complexity, since no extra space for data storage is necessary beyond the function calls.

                            java
                            
                        
class Solution {
    public int detectDuplicate(int[] elements) {
            
        int slow = elements[0];
        int fast = elements[0];
            
        do {
            slow = elements[slow];
            fast = elements[elements[fast]];
        } while (slow != fast);
    
        slow = elements[0];
            
        while (slow != fast) {
            slow = elements[slow];
            fast = elements[fast];
        }
    
        return fast;
    }
}

The provided Java solution is designed to find a duplicate number in an array where each integer is between 1 and n-1, with only one duplicate number present. The method detectDuplicate employs the Floyd's Tortoise and Hare algorithm to identify the duplicate. Here’s how the code works:

Initialize two pointers, slow and fast, at the starting element of the array.
Use a do-while loop to advance the slow pointer by one step and the fast pointer by two steps. This continues until both pointers meet, indicating a cycle which confirms a duplicate.
Reset the slow pointer to the start of the array. Then, move both pointers step-by-step at the same pace. The point at which they meet again is the duplicate element.

This approach ensures efficient detection of the duplicate with O(n) time complexity and O(1) space complexity, since it does not require extra storage for the data processing.

                            python
                            
                        
class Solution:
    def findRepeating(self, arr):
        slow = fast = arr[0]
        while True:
            slow = arr[slow]
            fast = arr[arr[fast]]
            if slow == fast:
                break
            
        slow = arr[0]
        while slow != fast:
            slow = arr[slow]
            fast = arr[fast]
            
        return fast

The solution provided solves the problem of finding the duplicate number in an array where numbers range from 1 to n and one number is repeated. The approach utilized here implements the Floyd's Tortoise and Hare cycle detection algorithm, which is efficient for such use cases. The method outlined involves two major phases:

Identifying a cycle within the sequence induced by the array:
- Establish two pointers, slow and fast, both starting at the first element.
- Move slow by one step and fast by two steps (i.e., slow = arr[slow], fast = arr[arr[fast]]) until they meet. This meeting point confirms a cycle indicating the repetition of a number.
Finding the entrance to this cycle which is the duplicate number:
- Reset the slow pointer back to the start of the array.
- Advance both slow and fast at the same pace, one step at a time.
- The point where they meet again determines the duplicate number, which is returned by the function.

This implementation is particularly memory efficient as it does not require additional space proportional to the size of the input array. The time complexity mainly revolves around the O(n) required to traverse the array multiple times—a linear relationship with the number of elements.

Comments

No comments yet.

Find the Duplicate Number

Problem Statement

Examples

Example 1

Example 2

Example 3

Constraints

Approach and Intuition

Solutions

Comments

Products

Features

Solutions

Marketplace

Resources

Company

Tech Talks

Vultr Blogs