208. Implement Trie (Prefix Tree)
208. Implement Trie (Prefix Tree)
Problem Statement
A trie (pronounced as “try”) or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the Trie class:
Trie()Initializes the trie object.void insert(String word)Inserts the stringwordinto the trie.boolean search(String word)Returnstrueif the stringwordis in the trie (i.e., was inserted before), andfalseotherwise.boolean startsWith(String prefix)Returnstrueif there is a previously inserted stringwordthat has the prefixprefix, andfalseotherwise.
Examples
Example 1:
Input
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output
[null, null, true, false, true, null, true]
Explanation
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple"); // return True
trie.search("app"); // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app"); // return True
Constraints
1 <= word.length, prefix.length <= 2000wordandprefixconsist only of lowercase English letters.- At most
3 * 10^4calls in total will be made toinsert,search, andstartsWith.
Clarification Questions
Before diving into the solution, here are 5 important clarifications and assumptions to discuss during an interview:
-
Character set: What characters can appear in words? (Assumption: Only lowercase English letters ‘a’-‘z’ - 26 characters)
-
Empty string: How should we handle empty strings for
searchandstartsWith? (Assumption: Based on constraints, length is at least 1, but we should clarify if empty string is considered a valid word) -
Duplicate insertion: What happens if we insert the same word multiple times? (Assumption: The word should still be searchable - we can mark it as a word each time or just once)
-
Prefix vs exact match: For
startsWith, should it return true for the exact word itself? (Assumption: Yes - a word is considered to start with itself) -
Memory constraints: Are there any space complexity requirements? (Assumption: O(ALPHABET_SIZE * N * M) where N is number of words and M is average length)
Interview Deduction Process (20 minutes)
Step 1: Brute-Force Approach (5 minutes)
Store all words in a list or set. For search, check if word exists in the list. For startsWith, iterate through all words and check if any starts with the prefix. This approach has O(n × m) time for search and O(n × m) for startsWith where n is number of words and m is average length, which is too slow for many operations.
Step 2: Semi-Optimized Approach (7 minutes)
Use a hash set for exact word search (O(1) average), but for prefix search, still need to iterate through all words. Alternatively, maintain a sorted list and use binary search for prefix matching, but this still requires checking multiple words. The challenge is efficiently finding all words with a given prefix.
Step 3: Optimized Solution (8 minutes)
Use a Trie (prefix tree) data structure. Each node represents a character, and paths from root to nodes represent prefixes. Insert: traverse/create path character by character, mark end nodes as complete words. Search: traverse path, check if end node is marked as word. StartsWith: traverse path, check if path exists. This achieves O(m) time per operation where m is word/prefix length, and O(ALPHABET_SIZE × total_characters) space. The key insight is that Trie naturally organizes words by their prefixes, allowing efficient prefix-based operations without scanning all words.
Solution Approach
A Trie (Prefix Tree) is a tree-like data structure where each node represents a character. The path from root to any node represents a prefix, and nodes can be marked to indicate the end of a word.
Key Components:
- TrieNode: Each node has:
- 26 children pointers (one for each lowercase letter)
- A boolean flag
isEndto mark end of word
- Operations:
- Insert: Traverse/create path for each character, mark end node
- Search: Traverse path, check if exists and is marked as end
- StartsWith: Traverse path, check if path exists (regardless of end flag)
- Memory Management: Proper destructor to clean up allocated nodes
Algorithm:
- Insert: For each character, create node if missing, traverse, mark end
- Search: Traverse path, return true only if path exists AND end is marked
- StartsWith: Traverse path, return true if path exists (end flag not required)
Solution
class Trie {
public:
Trie() : children(26), isWord(false) {
}
void insert(string word) {
Trie* node = this;
for(char ch: word) {
ch -= 'a';
if(!node->children[ch]) {
node->children[ch] = new Trie();
}
node = node->children[ch];
}
node->isWord = true;
}
bool search(string word) {
Trie* node = this->searchPrefix(word);
return node != nullptr && node->isWord;
}
bool startsWith(string prefix) {
return this->searchPrefix(prefix) != nullptr;
}
private:
vector<Trie*> children;
bool isWord;
Trie* searchPrefix(string prefix) {
Trie* node = this;
for(char ch: prefix) {
ch -= 'a';
if(!node->children[ch]) {
return nullptr;
}
node = node->children[ch];
}
return node;
}
};
/**
* Your Trie object will be instantiated and called as such:
* Trie* obj = new Trie();
* obj->insert(word);
* bool param_2 = obj->search(word);
* bool param_3 = obj->startsWith(prefix);
*/
Algorithm Explanation:
Trie Class Structure:
This implementation uses a self-referential design where each Trie object represents a node in the trie. The root is the Trie object itself (this).
- Data Members:
vector<Trie*> children: Array of 26 pointers (one for each lowercase letter)bool isWord: Flag marking if this node represents the end of a word
- Constructor:
- Initializes
childrenvector with 26nullptrelements - Sets
isWordtofalse
- Initializes
- insert(word):
- Start from
this(root node) - For each character in word:
- Convert to index:
ch -= 'a' - Create new
Trienode if child doesn’t exist - Move to child node
- Convert to index:
- Mark final node’s
isWordastrue
- Start from
- search(word):
- Use
searchPrefixhelper to find the node - Return
trueonly if node exists ANDisWordistrue
- Use
- startsWith(prefix):
- Use
searchPrefixhelper to find the node - Return
trueif node exists (regardless ofisWordflag)
- Use
- searchPrefix(prefix) (Private Helper):
- Start from
this(root) - Traverse following each character in prefix
- Return
nullptrif path doesn’t exist - Return the final node if path exists
- Start from
Example Walkthrough:
Operations: insert("apple"), search("app"), startsWith("app"), insert("app"), search("app")
Step 1: insert("apple")
this (root) -> children['a'] -> children['p'] -> children['p'] -> children['l'] -> children['e']
Final node: isWord = true
Step 2: search("app")
searchPrefix("app"): Traverse this -> 'a' -> 'p' -> 'p'
Node exists but isWord = false
Return: false ✓
Step 3: startsWith("app")
searchPrefix("app"): Traverse this -> 'a' -> 'p' -> 'p'
Node exists (regardless of isWord flag)
Return: true ✓
Step 4: insert("app")
Traverse existing path: this -> 'a' -> 'p' -> 'p'
Set isWord = true on the 'p' node
Step 5: search("app")
searchPrefix("app"): Traverse this -> 'a' -> 'p' -> 'p'
Node exists AND isWord = true
Return: true ✓
Complexity Analysis:
- Time Complexity:
insert(word): O(m) where m = word lengthsearch(word): O(m) where m = word lengthstartsWith(prefix): O(p) where p = prefix length- All operations are linear in the length of the input string
- Space Complexity:
- Worst Case: O(ALPHABET_SIZE × N × M)
- ALPHABET_SIZE = 26 (lowercase letters)
- N = number of words
- M = average word length
- Best Case (shared prefixes): O(ALPHABET_SIZE × M × N)
- Prefixes are shared, reducing space
- In practice, space depends on how many unique prefixes exist
- Worst Case: O(ALPHABET_SIZE × N × M)
Key Insights
- Self-Referential Design: Each
Trieobject is itself a node, making the structure simpler - Trie Structure: Tree where each path represents a prefix/word
- End Marker (
isWord): Critical to distinguish between prefix and complete word - Shared Prefixes: Multiple words sharing prefixes share nodes (space efficient)
- Array vs Map:
vector<Trie*>(26 elements) is faster and uses less memory thanunordered_map - Character Indexing:
ch -= 'a'converts character to 0-25 index efficiently - Helper Method:
searchPrefixreduces code duplication betweensearchandstartsWith - Root as
this: The Trie object itself serves as the root, eliminating need for separate root pointer
Edge Cases
- Empty string: Should be handled (mark root as end if needed)
- Single character: Works normally
- Duplicate insert: Same word inserted twice (safe, just re-marks end)
- Prefix of existing word:
insert("apple"), theninsert("app")works - Word extends prefix:
insert("app"), theninsert("apple")works - Non-existent search: Returns
falsecorrectly - Non-existent prefix:
startsWithreturnsfalsecorrectly
Common Mistakes
- Forgetting
isWordcheck:searchreturns true for prefixes if not checkingisWord - Not checking
isWordin search:search("app")returns true when only “apple” exists - Character indexing error: Forgetting
ch -= 'a'before accessingchildren[ch] - Index out of bounds: Not validating character is lowercase (0-25 range)
- Null pointer access: Not checking
if(!node->children[ch])before accessing - Incorrect traversal: Not updating
node = node->children[ch]in loop - Case sensitivity: Assuming only lowercase (problem constraint)
- Using
thisincorrectly: Forgetting that root isthisitself, not a separate pointer
Alternative Approaches
Using Separate TrieNode Class (with Memory Management)
class TrieNode {
public:
TrieNode* links[26];
bool isEnd;
TrieNode() {
for(int i = 0; i < 26; i++) links[i] = nullptr;
isEnd = false;
}
};
class Trie {
TrieNode* root;
public:
Trie() { root = new TrieNode(); }
~Trie() { deleteSubtree(root); }
// ... insert, search, startsWith with helper methods
void deleteSubtree(TrieNode* node) {
if(!node) return;
for(int i = 0; i < 26; i++) deleteSubtree(node->links[i]);
delete node;
}
};
Pros: Better encapsulation, explicit memory management with destructor
Cons: More code, requires separate node class
Using unordered_map for Children
class Trie {
unordered_map<char, Trie*> children;
bool isWord;
// ... rest of implementation
};
Pros: Supports any character set, more flexible
Cons: More memory overhead, slightly slower lookups than array
Note on Memory Management
The current implementation doesn’t include a destructor. For production code, consider adding:
~Trie() {
for(Trie* child : children) {
if(child) delete child;
}
}
However, for LeetCode problems, this is often omitted for simplicity.
When to Use Trie
- Autocomplete: Fast prefix matching
- Spell Checker: Dictionary lookup
- IP Routing: Longest prefix matching
- Search Engines: Prefix-based search
- Phone Directory: Name/contact lookup
- Word Games: Valid word checking
Related Problems
- LC 211: Design Add and Search Words Data Structure - Trie with wildcard search
- LC 212: Word Search II - Trie + DFS on board
- LC 642: Design Search Autocomplete System - Trie with frequency tracking
- LC 648: Replace Words - Trie for prefix replacement
- LC 677: Map Sum Pairs - Trie with value storage
- LC 720: Longest Word in Dictionary - Trie traversal
This problem is a fundamental data structure implementation that demonstrates the Trie (Prefix Tree) structure. It’s essential for understanding prefix-based string operations and is widely used in autocomplete systems and search engines.