H-Index II

Updated on 02 June, 2025
H-Index II header image

Problem Statement

The h-index is a metric that helps to measure the impact and productivity of a researcher based on the number of citations their publications receive. The challenge presented involves determining a researcher's h-index given an array of integers where each element represents the number of citations a specific paper received, and the array is sorted in ascending order. Specifically, h-index is defined as the maximum value h such that the researcher has h papers with at least h citations each. The solution must compute this in logarithmic time, implying the need for an algorithm more efficient than a straightforward linear approach.

Examples

Example 1

Input:

citations = [0,1,3,5,6]

Output:

3

Explanation:

[0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2

Input:

citations = [1,2,100]

Output:

2

Constraints

  • n == citations.length
  • 1 <= n <= 105
  • 0 <= citations[i] <= 1000
  • citations is sorted in ascending order.

Approach and Intuition

  1. Understand that in a sorted array, the h-index can be effectively found using binary search due to the ordered nature of citations.
  2. Start by defining the binary search bounds: left = 0 and right = n - 1, where n is the number of papers (i.e., the length of the citations array).
  3. The middle of the current search range (mid) is used for evaluating whether it meets the h-index criteria, i.e., the number of papers n - mid should be at least citations[mid].
  4. If citations[mid] is less than n - mid, it implies not enough papers have at least citations[mid] citations. Hence, move the left bound up to mid + 1.
  5. If citations[mid] is greater than or equal to n - mid, it is a potential h-index but might not be the maximum. Therefore, move the right bound down to mid - 1 to explore if a larger h-index might exist.
  6. Continue this process till left exceeds right. At this point, n - left usually gives the h-index.
  • This method works efficiently due to the sorted nature of the array, allowing for the logarithmic search.
  • The approach leverages the relationship between the index of a citation count and the number of papers with at least that many citations, optimized through binary search to find the maximal point where these counts intersect or cross.

Solutions

  • Java
java
class Solution {
    public int calculateHIndex(int[] citationsArray) {
        int length = citationsArray.length;
        int start = 0, end = length - 1;

        while (start <= end) {
            int middle = start + (end - start) / 2;

            if (citationsArray[middle] == length - middle)
                return citationsArray[middle];
            
            if (citationsArray[middle] < length - middle)
                start = middle + 1;
            else
                end = middle - 1;
        }

        return length - start;
    }
}

The provided solution addresses the problem of calculating the H-Index, where the citations array is sorted in ascending order. The H-Index calculation is done through a binary search approach.

  • Initialize length to the size of the citationsArray.
  • Set start to 0 and end to length - 1.

Follow these steps to implement the solution:

  1. Run a loop until start is less than or equal to end.
  2. Calculate middle as the average of start and end, adjusting for integer division.
  3. Check whether the citation at the middle index equals the difference between length and middle. If true, immediately return the citation at the middle index as the H-Index.
  4. If the citation number at middle is less than its positional difference from length, adjust start to middle + 1, narrowing the search towards higher citations.
  5. If the condition above does not hold true, adjust end to middle - 1, reducing the search range to consider lesser citations.
  6. Once the loop exits, return length - start as the calculated H-Index.

This solution efficiently calculates the H-Index in logarithmic time complexity, utilizing the sorted nature of the citations array to narrow down the search space quickly. The approach ensures that every element is properly evaluated to determine if it qualifies as the H-Index without the need for a full array traversal.

Comments

No comments yet.