215. Kth Largest Element in an Array
215. Kth Largest Element in an Array
Problem Statement
Given an integer array nums and an integer k, return the kth largest element in the array.
Note that it is the kth largest element in the sorted order, not the kth distinct element.
Can you solve it without sorting?
Examples
Example 1:
Input: nums = [3,2,1,5,6,4], k = 2
Output: 5
Explanation: The 2nd largest element is 5.
Example 2:
Input: nums = [3,2,3,1,2,4,5,5,6], k = 4
Output: 4
Explanation: The 4th largest element is 4.
Constraints
1 <= k <= nums.length <= 10^5-10^4 <= nums[i] <= 10^4
Clarification Questions
Before diving into the solution, here are 5 important clarifications and assumptions to discuss during an interview:
-
Kth largest definition: How is “kth largest” defined? (Assumption: When sorted in descending order, the element at position k-1 (1-indexed kth largest))
-
Duplicate handling: How should we handle duplicate values? (Assumption: Duplicates are counted separately - if [3,2,3,1], 1st largest is 3, 2nd is 3, 3rd is 2)
-
Array modification: Can we modify the input array? (Assumption: Typically yes - QuickSelect modifies array, but should clarify)
-
K validity: Is k guaranteed to be valid? (Assumption: Yes - per constraints, 1 <= k <= nums.length)
-
Return value: Should we return the value or the index? (Assumption: Return the value - integer representing kth largest element)
Interview Deduction Process (20 minutes)
Step 1: Brute-Force Approach (5 minutes)
Initial Thought: “I need to find kth largest. Let me sort the array.”
Naive Solution: Sort array in descending order, return element at index k-1.
Complexity: O(n log n) time, O(1) space
Issues:
- O(n log n) time when O(n) is possible
- Sorts entire array when only need kth element
- Doesn’t leverage partial sorting
- Can be optimized
Step 2: Semi-Optimized Approach (7 minutes)
Insight: “I can use min-heap of size k to track k largest elements.”
Improved Solution: Use min-heap of size k. For each element, if heap size < k, add; else if element > heap top, replace top. Heap top is kth largest.
Complexity: O(n log k) time, O(k) space
Improvements:
- O(n log k) time is better than O(n log n)
- Heap efficiently maintains k largest
- O(k) space is better than O(n)
- Can optimize further
Step 3: Optimized Solution (8 minutes)
Final Optimization: “QuickSelect can achieve O(n) average time.”
Best Solution: QuickSelect (partition from quicksort) can achieve O(n) average time. Partition array, if pivot is at kth position, return it; otherwise recurse on appropriate partition.
Complexity: O(n) average, O(n²) worst case, O(1) space
Key Realizations:
- QuickSelect is optimal for kth element
- O(n) average time is best possible
- Min-heap is simpler but O(n log k)
- QuickSelect modifies array
Solution Approaches
There are several approaches to solve this problem:
- Min Heap: Maintain a min heap of size k, keeping only the k largest elements
- QuickSelect: Use partition-based selection algorithm (similar to quicksort)
- Sorting: Sort the array and return the kth element (O(n log n))
Approach 1: Min Heap (Recommended)
Time Complexity: O(n log k)
Space Complexity: O(k)
Use a min heap to maintain the k largest elements. When the heap size exceeds k, remove the smallest element.
Approach 2: QuickSelect
Time Complexity: O(n) average, O(n²) worst case
Space Complexity: O(1) (excluding recursion stack)
Use the partition algorithm from quicksort to find the kth largest element without fully sorting the array.
Solution 1: Min Heap
class Solution {
public:
int findKthLargest(vector<int>& nums, int k) {
priority_queue<int, vector<int>, greater<>> minHeap;
for(int num: nums) {
minHeap.push(num);
if(minHeap.size() > k) minHeap.pop();
}
return minHeap.top();
}
};
Algorithm Explanation:
- Initialize: Create a min heap using
priority_queuewithgreater<>comparator - Process Elements:
- Push each element into the min heap
- If heap size exceeds
k, pop the smallest element (maintains k largest)
- Return: The top element is the kth largest
Why Min Heap for K Largest?
- Min heap keeps the smallest element at the top
- By maintaining size
k, we keep theklargest elements - The top element is the smallest among the
klargest, which is thekthlargest overall
Complexity Analysis:
- Time: O(n log k) - Each of n elements is pushed/popped from a heap of size k
- Space: O(k) - Heap stores at most k elements
Detailed Walkthrough: Min Heap Example
Let’s trace through an example to understand how the min heap solution works:
Input: nums = [3, 2, 1, 5, 6, 4], k = 2 (find 2nd largest)
Goal: Maintain a min heap of size 2 containing the 2 largest elements.
Step-by-step execution:
-
Initialize:
minHeap = [](empty) - Process 3:
- Push 3:
minHeap = [3] - Size = 1 ≤ 2, no pop needed
- Heap:
[3](top: 3)
- Push 3:
- Process 2:
- Push 2:
minHeap = [2, 3](min heap: 2 at top) - Size = 2 ≤ 2, no pop needed
- Heap:
[2, 3](top: 2)
- Push 2:
- Process 1:
- Push 1:
minHeap = [1, 3, 2]→ heapify →[1, 2, 3] - Size = 3 > 2, pop smallest: remove 1
- Heap:
[2, 3](top: 2) - Note: We removed 1 (smallest), keeping the 2 largest so far: [2, 3]
- Push 1:
- Process 5:
- Push 5:
minHeap = [2, 3, 5]→ heapify →[2, 3, 5] - Size = 3 > 2, pop smallest: remove 2
- Heap:
[3, 5](top: 3) - Note: We removed 2 (smallest), keeping the 2 largest so far: [3, 5]
- Push 5:
- Process 6:
- Push 6:
minHeap = [3, 5, 6]→ heapify →[3, 5, 6] - Size = 3 > 2, pop smallest: remove 3
- Heap:
[5, 6](top: 5) - Note: We removed 3 (smallest), keeping the 2 largest so far: [5, 6]
- Push 6:
- Process 4:
- Push 4:
minHeap = [4, 5, 6]→ heapify →[4, 5, 6] - Size = 3 > 2, pop smallest: remove 4
- Heap:
[5, 6](top: 5) - Note: We removed 4 (smallest), keeping the 2 largest: [5, 6]
- Push 4:
- Result:
- Final heap:
[5, 6](top: 5) - Return
minHeap.top()= 5 ✓ (2nd largest)
- Final heap:
Visual representation:
After each step:
Step 1: [3] → Keep: [3]
Step 2: [2, 3] → Keep: [2, 3]
Step 3: [1, 2, 3] → pop 1 → Keep: [2, 3]
Step 4: [2, 3, 5] → pop 2 → Keep: [3, 5]
Step 5: [3, 5, 6] → pop 3 → Keep: [5, 6]
Step 6: [4, 5, 6] → pop 4 → Keep: [5, 6]
Final: Top of heap = 5 (2nd largest)
Key Insight:
- Min heap keeps the smallest element at the top
- By maintaining exactly
kelements, we keep theklargest elements - When size exceeds
k, we remove the smallest (which is the top) - After processing all elements, the top is the
kthlargest
Solution 2: QuickSelect
class Solution {
public:
int findKthLargest(vector<int>& nums, int k) {
const int N = nums.size();
return quickSelect(nums, 0, N - 1, N - k);
}
private:
int quickSelect(vector<int>& nums, int l, int r, int k) {
if(l == r) return nums[k];
int pivot = nums[l], i = l - 1, j = r + 1;
while(i < j) {
do i++; while(nums[i] < pivot);
do j--; while(nums[j] > pivot);
if(i < j)
swap(nums[i], nums[j]);
}
if(k <= j) return quickSelect(nums, l, j, k);
else return quickSelect(nums, j + 1, r, k);
}
};
Algorithm Explanation:
- Partition: Use Hoare’s partition scheme to partition around a pivot
- Recursive Selection:
- If kth element is in left partition, recurse on left
- Otherwise, recurse on right partition
- Base Case: When
l == r, we’ve found the kth element
How QuickSelect Works:
- Choose Pivot: Use first element as pivot (can be randomized for better average performance)
- Partition: Rearrange array so elements < pivot are on left, > pivot on right
- Recurse: Based on pivot position, recurse on the partition containing the kth element
Complexity Analysis:
- Time:
- Average: O(n) - Each partition eliminates roughly half the elements
- Worst case: O(n²) - Bad pivot selection (can be avoided with randomization)
- Space: O(1) excluding recursion stack (O(log n) average, O(n) worst case)
Key Insight:
- We’re looking for the element at position
N - kin sorted order (0-indexed) - QuickSelect finds the element at position
kwithout fully sorting - Only recurses on the partition containing the target element
Example Walkthrough
Input:
nums = [3, 2, 3, 1, 2, 4, 5, 5, 6]
k = 4
We search for index:
N - k = 9 - 4 = 5
First call:
quickSelect(nums, l=0, r=8, k=5)
pivot = nums[0] = 3
i = -1, j = 9 (initial)
Partition process:
- Move i right: i=0 (nums[0]=3 >= 3)
- Move j left: j=8 (nums[8]=6 > 3), j=7 (nums[7]=5 > 3), j=6 (nums[6]=5 > 3), j=5 (nums[5]=4 > 3), j=4 (nums[4]=2 <= 3)
- Swap nums[0] and nums[4]
- Continue: i=1 (nums[1]=2 < 3), i=2 (nums[2]=3 >= 3)
- Continue: j=3 (nums[3]=1 <= 3)
- Swap nums[2] and nums[3]
- Continue: i=3 (nums[3]=3 >= 3)
- Continue: j=2 (nums[2]=1 <= 3)
- Loop ends: i=3, j=2
After partition:
[2, 2, 1, 3, 3 | 4, 5, 5, 6]
l=0 j=2 i=3 r=8
l=0, r=8, j=2, k=5
k > j → search right side [j+1, r] = [3, 8]
Second call:
quickSelect(nums, l=3, r=8, k=5)
pivot = nums[3] = 3
i = 2, j = 9 (initial)
Partition process:
- Move i right: i=3 (nums[3]=3 >= 3)
- Move j left: j=8 (nums[8]=6 > 3), j=7 (nums[7]=5 > 3), j=6 (nums[6]=5 > 3), j=5 (nums[5]=4 > 3), j=4 (nums[4]=3 <= 3)
- Loop ends: i=3, j=4
After partition:
[2, 2, 1, 3, 3 | 4, 5, 5, 6]
l=3 j=4 i=5 r=8
l=3, r=8, j=4, k=5
k > j → search right side [j+1, r] = [5, 8]
Third call:
quickSelect(nums, l=5, r=8, k=5)
pivot = nums[5] = 4
i = 4, j = 9 (initial)
Partition process:
- Move i right: i=5 (nums[5]=4 >= 4)
- Move j left: j=8 (nums[8]=6 > 4), j=7 (nums[7]=5 > 4), j=6 (nums[6]=5 > 4), j=5 (nums[5]=4 <= 4)
- Loop ends: i=5, j=5
After partition:
[2, 2, 1, 3, 3, 4 | 5, 5, 6]
l=5 j=5 i=6 r=8
l=5, r=8, j=5, k=5
k <= j → search left side [l, j] = [5, 5]
Base case:
quickSelect(nums, l=5, r=5, k=5)
l == r == 5
return nums[5] = 4
✅ Answer = 4
Runtime Analysis
Why QuickSelect is O(n) Average Case:
- Each Partition Step is O(m) where m is the subarray size:
- First call: Partitions 9 elements → O(9) time
- Second call: Partitions 6 elements → O(6) time
- Third call: Partitions 4 elements → O(4) time
- Each partition makes a single pass through the subarray using two pointers (i and j)
- Time per partition = size of subarray being partitioned
- Why the total is O(n) and not O(n²):
- Key insight: We only recurse on ONE partition (the one containing k), not both
- On average, each partition eliminates roughly half the elements
- The subarray sizes form a geometric series: n + n/2 + n/4 + n/8 + …
- Geometric series sum: n × (1 + 1/2 + 1/4 + 1/8 + …) = n × 2 = O(n)
- In our example: 9 + 6 + 4 = 19 ≈ 2n (where n=9)
- Why worst case is O(n²):
- If pivot is always the smallest/largest element, we eliminate only 1 element per partition
- Subarray sizes: n + (n-1) + (n-2) + … = n(n+1)/2 = O(n²)
- Can be avoided by randomizing the pivot selection
- Space Complexity:
- O(1) extra space per recursive call (only storing l, r, k, pivot, i, j)
- Recursion stack: O(log n) average depth, O(n) worst case
- Total space: O(log n) average, O(n) worst case
Comparison of Approaches
| Approach | Time Complexity | Space Complexity | When to Use |
|---|---|---|---|
| Min Heap | O(n log k) | O(k) | When k is small, need streaming solution |
| QuickSelect | O(n) avg, O(n²) worst | O(1) | When k is large, need O(1) space |
| Sorting | O(n log n) | O(1) | Simple but less efficient |
Key Insights
- Min Heap Pattern: Use min heap to maintain k largest elements
- QuickSelect: Partition-based selection avoids full sorting
- Index Conversion: kth largest = (n-k)th smallest in sorted array
- Space Trade-off: Heap uses O(k) space, QuickSelect uses O(1) space
Follow-up Questions
- What if we need to handle dynamic updates (add/remove elements)?
- How would you optimize for very large datasets that don’t fit in memory?
- What if we need the k largest elements in sorted order?
Related Problems
- LC 347: Top K Frequent Elements - Similar heap pattern
- LC 973: K Closest Points to Origin - K selection with custom comparator
- LC 703: Kth Largest Element in a Stream - Dynamic version
- LC 378: Kth Smallest Element in a Sorted Matrix - 2D variant
Implementation Notes
- Min Heap: Use
priority_queue<int, vector<int>, greater<>>for min heap - QuickSelect: Use Hoare’s partition for better performance
- Randomization: For QuickSelect, randomize pivot to avoid worst case
- Edge Cases: Handle k = 1, k = n, and single element arrays
This problem demonstrates two fundamental approaches: heap-based selection and partition-based selection. The choice depends on constraints and requirements.