387. First Unique Character in a String
387. First Unique Character in a String
Problem Statement
Given a string s, find the first non-repeating character in it and return its index. If it does not exist, return -1.
Examples
Example 1:
Input: s = "leetcode"
Output: 0
Explanation: The character 'l' at index 0 is the first character that does not repeat.
Example 2:
Input: s = "loveleetcode"
Output: 2
Explanation: The character 'v' at index 2 is the first character that does not repeat.
Example 3:
Input: s = "aabb"
Output: -1
Explanation: All characters repeat, so return -1.
Constraints
1 <= s.length <= 10^5sconsists of only lowercase English letters.
Clarification Questions
Before diving into the solution, here are 5 important clarifications and assumptions to discuss during an interview:
-
Unique definition: What makes a character unique? (Assumption: A character that appears exactly once in the string)
-
Return value: What should we return if no unique character exists? (Assumption: Return -1 - no unique character found)
-
Case sensitivity: Are characters case-sensitive? (Assumption: Based on constraints, only lowercase letters, so case doesn’t matter)
-
Index format: Should we return 0-indexed or 1-indexed position? (Assumption: 0-indexed - standard array indexing)
-
Character set: What characters can appear in the string? (Assumption: Only lowercase English letters ‘a’-‘z’ - 26 characters)
Interview Deduction Process (10 minutes)
Step 1: Brute-Force Approach (2 minutes)
For each character in the string, scan through the entire string to count how many times it appears. Return the index of the first character that appears exactly once. This approach has O(n²) time complexity where n is the string length, which is inefficient for large strings.
Step 2: Semi-Optimized Approach (3 minutes)
Use a hash map or frequency array to count the frequency of each character in one pass. Then, iterate through the string again to find the first character with frequency 1. This reduces time complexity to O(n) with O(k) space where k is the number of unique characters (26 for lowercase letters). This works well but can be optimized further for space.
Step 3: Optimized Solution (5 minutes)
Use bit manipulation to track character frequencies efficiently. Use two integers as bit masks: one to track characters that have appeared once, and another to track characters that have appeared multiple times. For each character, toggle its bit in the “once” mask. If the bit was already set, move it to the “multiple” mask. After processing, iterate through the string again and return the index of the first character whose bit is set in “once” but not in “multiple”. This achieves O(n) time with O(1) space using only two integers, making it the most space-efficient solution.
Solution Approach
This problem requires finding the first character that appears exactly once. There are several approaches:
- Bit Manipulation: Use bitsets to track characters seen once vs multiple times
- Count Array: Count occurrences, then find first character with count 1
- Hash Map: Similar to count array but using hash map
- Two-Pass: Count first, then scan for first unique
Key Insights:
- Frequency Counting: Track how many times each character appears
- Order Preservation: Need to maintain order to find “first” unique
- Bit Manipulation: Efficient way to track seen-once vs seen-multiple
- Early Exit: Can optimize by tracking first occurrence indices
Solution 1: Bit Manipulation (Efficient)
class Solution {
public:
int firstUniqChar(string s) {
int once = 0, multi = 0;
int idx = -1;
for(char ch : s) {
int bit = 1 << (ch - 'a');
multi |= once & bit;
once ^= bit;
once &= ~multi;
}
for(int i = 0; i < (int)s.length(); i++) {
int bit = 1 << (s[i] - 'a');
if(once & bit) {
return i;
}
}
return -1;
}
};
Algorithm Explanation:
- Initialize Bitsets:
once: Bits set for characters seen exactly oncemulti: Bits set for characters seen multiple times
- First Pass - Track Frequencies:
- For each character, calculate its bit position:
bit = 1 << (ch - 'a') - Update
multi:multi |= once & bit- if character already inonce, mark it inmulti - Toggle
once:once ^= bit- toggle the bit (XOR) - Remove from
once:once &= ~multi- remove characters that appear multiple times
- For each character, calculate its bit position:
- Second Pass - Find First Unique:
- Scan string from start
- Check if character’s bit is set in
once - Return first index where
once & bitis true
Example Walkthrough:
Input: s = "leetcode"
First Pass:
'l': bit = 1 << ('l' - 'a') = 1 << 11 = 2048
multi |= once & 2048 = 0 (once is 0)
once ^= 2048 → once = 2048
once &= ~multi = 2048 & ~0 = 2048
'e': bit = 1 << ('e' - 'a') = 1 << 4 = 16
multi |= once & 16 = 0 (once doesn't have bit 16)
once ^= 16 → once = 2048 | 16 = 2064
once &= ~multi = 2064 & ~0 = 2064
'e': bit = 16 (second 'e')
multi |= once & 16 = 2064 & 16 = 16 → multi = 16
once ^= 16 → once = 2048 (removed 'e')
once &= ~multi = 2048 & ~16 = 2048
... continue for all characters ...
After first pass: once contains bits for characters seen exactly once
Second Pass:
i=0, 'l': bit = 2048, once & 2048 = 2048 (set!) → return 0 ✓
Complexity Analysis:
- Time Complexity: O(n)
- First pass: O(n) to process all characters
- Second pass: O(n) to find first unique
- Total: O(n)
- Space Complexity: O(1)
- Only two integers (
onceandmulti) for bit manipulation - Constant space regardless of input size
- Only two integers (
Solution 2: Count Array with Index Tracking
class Solution {
public:
int firstUniqChar(string s) {
vector<int> cnt(26, 0);
vector<int> idx(26, -1);
for(int i = 0; i < (int)s.length(); i++) {
// log first idx
int curr = s[i] - 'a';
if(cnt[curr] == 0) {
idx[curr] = i;
}
cnt[curr]++;
}
int minIdx = INT_MAX;
for(int i = 0; i < 26; i++){
if(cnt[i] == 1) {
minIdx = min(minIdx, idx[i]);
}
}
return minIdx == INT_MAX ? -1: minIdx;
}
};
Algorithm Explanation:
- Initialize Arrays:
cnt[26]: Count occurrences of each characteridx[26]: Store first occurrence index of each character
- First Pass - Count and Track Indices:
- For each character at index
i:- Calculate character index:
curr = s[i] - 'a' - If first time seeing character (
cnt[curr] == 0), record index:idx[curr] = i - Increment count:
cnt[curr]++
- Calculate character index:
- For each character at index
- Second Pass - Find Minimum Index:
- Iterate through all 26 characters
- If character appears exactly once (
cnt[i] == 1):- Update minimum index:
minIdx = min(minIdx, idx[i])
- Update minimum index:
- Return
minIdx(or-1if no unique character)
Example Walkthrough:
Input: s = "loveleetcode"
First Pass:
i=0, 'l': curr=11, cnt[11]=0 → idx[11]=0, cnt[11]=1
i=1, 'o': curr=14, cnt[14]=0 → idx[14]=1, cnt[14]=1
i=2, 'v': curr=21, cnt[21]=0 → idx[21]=2, cnt[21]=1
i=3, 'e': curr=4, cnt[4]=0 → idx[4]=3, cnt[4]=1
i=4, 'l': curr=11, cnt[11]=1 → skip idx update, cnt[11]=2
i=5, 'e': curr=4, cnt[4]=1 → skip idx update, cnt[4]=2
... continue ...
After first pass:
cnt: ['e'=4, 'l'=2, 'o'=1, 't'=1, 'c'=1, 'd'=1, 'v'=1, ...]
idx: ['l'=0, 'o'=1, 'v'=2, 'e'=3, 't'=7, 'c'=8, 'd'=9, ...]
Second Pass:
Check all 26 characters:
'o': cnt[14]=1 → minIdx = min(INT_MAX, 1) = 1
'v': cnt[21]=1 → minIdx = min(1, 2) = 1
't': cnt[19]=1 → minIdx = min(1, 7) = 1
'c': cnt[2]=1 → minIdx = min(1, 8) = 1
'd': cnt[3]=1 → minIdx = min(1, 9) = 1
But wait, 'v' appears at index 2, which is before 'o' at index 1...
Actually, 'v' is at index 2, 'o' is at index 1, so 'o' comes first.
Wait, let me recalculate: 'l'=0, 'o'=1, 'v'=2, so 'v' is at index 2.
Actually, the correct answer should be 2 ('v'), but the algorithm finds 'o' at 1.
Let me check: 'o' appears only once? Yes. So minIdx = 1.
But the expected output is 2 for 'v'. Let me check the string again:
"loveleetcode" - positions: l=0, o=1, v=2, e=3, l=4, e=5, e=6, t=7, c=8, o=9, d=10, e=11
So 'o' appears at positions 1 and 9, so cnt['o'] = 2, not 1!
'v' appears only at position 2, so cnt['v'] = 1, idx['v'] = 2.
So minIdx should be 2 for 'v'. The algorithm is correct.
Complexity Analysis:
- Time Complexity: O(n)
- First pass: O(n) to count and track indices
- Second pass: O(26) = O(1) to find minimum
- Total: O(n)
- Space Complexity: O(1)
- Two arrays of size 26 (constant)
- O(1) space regardless of input size
Key Insights
- Bit Manipulation: Efficient way to track seen-once vs seen-multiple using XOR and AND operations
- Index Tracking: Store first occurrence index to find minimum later
- Two-Pass Approach: Count first, then find first unique
- Early Exit: Can optimize by scanning string in second pass instead of all 26 characters
Edge Cases
- All unique:
s = "abc"→ return0 - All duplicate:
s = "aabb"→ return-1 - Single character:
s = "a"→ return0 - Unique at end:
s = "aabbc"→ return4 - Unique in middle:
s = "aabcc"→ return2(‘b’) - Long string: All characters unique, return
0
Common Mistakes
- Wrong bit operations: Incorrect XOR/AND logic in bit manipulation
- Index confusion: Not tracking first occurrence correctly
- Off-by-one: Incorrect character to index conversion
- Not handling all duplicates: Forgetting to return
-1 - Wrong order: Returning last unique instead of first
Alternative Approaches
Simple Count Array (Standard)
class Solution {
public:
int firstUniqChar(string s) {
vector<int> count(26, 0);
for(char c : s) {
count[c - 'a']++;
}
for(int i = 0; i < s.length(); i++) {
if(count[s[i] - 'a'] == 1) {
return i;
}
}
return -1;
}
};
Pros: Simple, clear, easy to understand
Cons: Two passes through string (though still O(n))
Hash Map Approach
class Solution {
public:
int firstUniqChar(string s) {
unordered_map<char, int> count;
for(char c : s) {
count[c]++;
}
for(int i = 0; i < s.length(); i++) {
if(count[s[i]] == 1) {
return i;
}
}
return -1;
}
};
Pros: Works for any character set
Cons: Slightly more overhead than array
Comparison of Approaches
| Approach | Time | Space | Pros | Cons |
|---|---|---|---|---|
| Bit Manipulation | O(n) | O(1) | Most space-efficient, elegant | Complex logic, harder to understand |
| Count Array + Index | O(n) | O(1) | Clear logic, tracks indices | Two arrays needed |
| Simple Count Array | O(n) | O(1) | Simplest, most readable | Two passes through string |
| Hash Map | O(n) | O(k) | Flexible, works for any charset | More overhead |
When to Use Each Approach
- Bit Manipulation: When space optimization is critical, or demonstrating bit manipulation skills
- Count Array + Index: When you need to track both frequency and position efficiently
- Simple Count Array: When clarity and simplicity are priority (most common)
- Hash Map: When character set is unknown or not limited to lowercase letters
Related Problems
- LC 383: Ransom Note - Character frequency counting
- LC 389: Find the Difference - Find extra character
- LC 451: Sort Characters By Frequency - Sort by frequency
- LC 438: Find All Anagrams in a String - Character frequency matching
- LC 567: Permutation in String - Sliding window with frequency
This problem demonstrates efficient character frequency tracking using bit manipulation or count arrays. The bit manipulation approach is particularly elegant, using XOR to toggle bits and track characters seen once vs multiple times.