[Medium] 274. H-Index
Given an array of integers citations where citations[i] is the number of citations a researcher received for their i-th paper, return the researcher’s h-index.
The h-index is defined as: the maximum value of h such that the researcher has published at least h papers that have each been cited at least h times.
Examples
Example 1:
Input: citations = [3, 0, 6, 1, 5]
Output: 3
Sort descending: [6, 5, 3, 1, 0]
i (papers seen) citations[i] >= i+1 ?
1 6 >= 1 YES
2 5 >= 2 YES
3 3 >= 3 YES
4 1 >= 4 NO ← stop
Answer = 3 (3 papers with >= 3 citations each)
Example 2:
Input: citations = [1, 3, 1]
Output: 1
Constraints
n == citations.length1 <= n <= 50000 <= citations[i] <= 1000
Thinking Process
What Are We Really Looking For?
This is not about total citations. It’s about finding a balance point: the largest h where at least h papers have >= h citations.
Why Sorting Works
After sorting in descending order:
- Position
i+1= number of papers we’ve seen so far - Value
citations[i]= the citation count of the current paper
At each position, we’re asking: “Does this paper have enough citations to support the current count of papers?”
The moment citations[i] < i + 1, the balance breaks and we’ve found our answer.
Sorted: [6, 5, 3, 1, 0]
Index 0: 6 >= 1 → at least 1 paper with >= 1 citations ✓
Index 1: 5 >= 2 → at least 2 papers with >= 2 citations ✓
Index 2: 3 >= 3 → at least 3 papers with >= 3 citations ✓
Index 3: 1 >= 4 → at least 4 papers with >= 4 citations ✗ ← STOP
h = 3
Can We Do Better Than $O(n \log n)$?
Yes – since h can be at most n, we can use counting sort to get $O(n)$.
Solution 1: Sort Descending – $O(n \log n)$ time, $O(1)$ space
class Solution {
public:
int hIndex(vector<int>& citations) {
sort(citations.begin(), citations.end(), greater<int>());
for (int i = 0; i < citations.size(); ++i) {
if (citations[i] < i + 1) return i;
}
return citations.size();
}
};
If every paper has enough citations, the loop finishes without returning, and h = n.
Solution 2: Counting Sort – $O(n)$ time, $O(n)$ space
class Solution {
public:
int hIndex(vector<int>& citations) {
int n = citations.size();
vector<int> count(n + 1, 0);
for (int c : citations) {
count[min(c, n)]++;
}
int total = 0;
for (int h = n; h >= 0; --h) {
total += count[h];
if (total >= h) return h;
}
return 0;
}
};
How It Works
-
Build a frequency array
count[i]= number of papers with exactlyicitations. Papers with>= ncitations are bucketed intocount[n]sincehcan never exceedn. -
Scan from
h = ndown to 0, accumulating the total number of papers with>= hcitations. The firsthwheretotal >= his the answer.
citations = [3, 0, 6, 1, 5], n = 5
count: index 0 1 2 3 4 5
value 1 1 0 1 0 2 (6 and 5 both go into bucket 5)
Scan from h=5: total=2, 2 >= 5? no
h=4: total=2, 2 >= 4? no
h=3: total=3, 3 >= 3? YES → return 3
Comparison
| Aspect | Sorting | Counting Sort |
|---|---|---|
| Time | $O(n \log n)$ | $O(n)$ |
| Space | $O(1)$ (in-place sort) | $O(n)$ (frequency array) |
| Interview preference | Most common, easy to explain | Good follow-up for optimization |
| Key idea | Descending order gives “papers seen” naturally | Bucket citations, scan from top |
Common Mistakes
- Confusing h-index with max citations:
his bounded by the number of papers, not the citation values - Off-by-one in sorting approach: Position
imeansi + 1papers (0-indexed), so the check iscitations[i] < i + 1 - Forgetting the
min(c, n)clamp in counting sort: Any citation count>= nis equivalent sinceh <= n
Key Takeaways
- The h-index is a balance point between quantity (number of papers) and quality (citation count)
- Sort descending and scan gives the most intuitive solution
- Counting sort exploits the fact that
h <= nto achieve linear time - This pattern of “find the largest k satisfying a threshold” appears in many problems
Related Problems
- 275. H-Index II – sorted input, use binary search for $O(\log n)$
- 287. Find the Duplicate Number – counting / pigeonhole
- 169. Majority Element – finding a threshold in an array