
Problem Statement
The h-index is a metric that helps to measure the impact and productivity of a researcher based on the number of citations their publications receive. The challenge presented involves determining a researcher's h-index given an array of integers where each element represents the number of citations a specific paper received, and the array is sorted in ascending order. Specifically, h-index is defined as the maximum value h
such that the researcher has h
papers with at least h
citations each. The solution must compute this in logarithmic time, implying the need for an algorithm more efficient than a straightforward linear approach.
Examples
Example 1
Input:
citations = [0,1,3,5,6]
Output:
3
Explanation:
[0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2
Input:
citations = [1,2,100]
Output:
2
Constraints
n == citations.length
1 <= n <= 105
0 <= citations[i] <= 1000
citations
is sorted in ascending order.
Approach and Intuition
- Understand that in a sorted array, the h-index can be effectively found using binary search due to the ordered nature of citations.
- Start by defining the binary search bounds:
left = 0
andright = n - 1
, wheren
is the number of papers (i.e., the length of thecitations
array). - The middle of the current search range (
mid
) is used for evaluating whether it meets the h-index criteria, i.e., the number of papersn - mid
should be at leastcitations[mid]
. - If
citations[mid]
is less thann - mid
, it implies not enough papers have at leastcitations[mid]
citations. Hence, move the left bound up tomid + 1
. - If
citations[mid]
is greater than or equal ton - mid
, it is a potential h-index but might not be the maximum. Therefore, move the right bound down tomid - 1
to explore if a larger h-index might exist. - Continue this process till
left
exceedsright
. At this point,n - left
usually gives the h-index.
- This method works efficiently due to the sorted nature of the array, allowing for the logarithmic search.
- The approach leverages the relationship between the index of a citation count and the number of papers with at least that many citations, optimized through binary search to find the maximal point where these counts intersect or cross.
Solutions
- Java
class Solution {
public int calculateHIndex(int[] citationsArray) {
int length = citationsArray.length;
int start = 0, end = length - 1;
while (start <= end) {
int middle = start + (end - start) / 2;
if (citationsArray[middle] == length - middle)
return citationsArray[middle];
if (citationsArray[middle] < length - middle)
start = middle + 1;
else
end = middle - 1;
}
return length - start;
}
}
The provided solution addresses the problem of calculating the H-Index, where the citations array is sorted in ascending order. The H-Index calculation is done through a binary search approach.
- Initialize
length
to the size of thecitationsArray
. - Set
start
to 0 andend
tolength - 1
.
Follow these steps to implement the solution:
- Run a loop until
start
is less than or equal toend
. - Calculate
middle
as the average ofstart
andend
, adjusting for integer division. - Check whether the citation at the
middle
index equals the difference betweenlength
andmiddle
. If true, immediately return the citation at themiddle
index as the H-Index. - If the citation number at
middle
is less than its positional difference fromlength
, adjuststart
tomiddle + 1
, narrowing the search towards higher citations. - If the condition above does not hold true, adjust
end
tomiddle - 1
, reducing the search range to consider lesser citations. - Once the loop exits, return
length - start
as the calculated H-Index.
This solution efficiently calculates the H-Index in logarithmic time complexity, utilizing the sorted nature of the citations array to narrow down the search space quickly. The approach ensures that every element is properly evaluated to determine if it qualifies as the H-Index without the need for a full array traversal.
No comments yet.