C++ std::set Guide: Ordered Container Essentials
C++ std::set Guide: Ordered Container Essentials
Everything you need to work effectively with std::set: construction patterns, common methods, idiomatic lookups, and answers to “why no contains()?” and “where are top()/back()?”.
1. Overview
std::set is an ordered container implemented as a balanced binary search tree (typically red–black). It stores unique keys sorted by < (or a custom comparator).
Key characteristics:
- Automatic ordering
- Logarithmic insert/erase/find
- Stable iterators (as long as the pointed element is not erased)
- No random access; iterators are bidirectional only
#include <set>
#include <iostream>
int main() {
std::set<int> s = {3, 1, 4, 1, 5};
for (int x : s) {
std::cout << x << " "; // 1 3 4 5
}
}
2. Construction & Basic Usage
#include <set>
void basics() {
std::set<std::string> names; // empty
std::set<int> nums = {5, 2, 9, 2}; // duplicates removed
std::set<int, std::greater<int>> desc = {1,2,3}; // custom comparator
// Range construction
std::vector<int> data = {4, 1, 4, 2};
std::set<int> from_vec(data.begin(), data.end()); // {1,2,4}
}
3. Core Member Functions
Insertion
void insert_examples() {
std::set<int> s;
s.insert(3);
s.insert({1, 4, 1}); // initializer list
auto [it, inserted] = s.insert(2);
if (!inserted) {
// value already existed
}
s.emplace(6); // construct in-place (log n)
}
Erase
void erase_examples() {
std::set<int> s = {1, 3, 5, 7};
s.erase(3); // by key
auto it = s.find(5);
if (it != s.end()) {
s.erase(it); // by iterator
}
s.erase(s.begin(), s.end()); // by range
}
Lookup
void lookup_examples() {
std::set<int> s = {2, 4, 6};
auto it = s.find(4); // iterator or end()
bool exists = (it != s.end());
size_t cnt = s.count(5); // 0 or 1
if (cnt > 0) { /* value present */ }
auto lb = s.lower_bound(3); // first >= 3
auto ub = s.upper_bound(3); // first > 3
}
4. Where Is contains()?
- C++20 and later:
std::setdoes includecontains(const Key&)returningbool. - Pre-C++20 (or when targeting older compilers): rely on
count()orfind().
std::set<int> s = {1, 2, 3};
#if __cpp_lib_erase_if >= 202002
bool has_two = s.contains(2); // C++20
#else
bool has_two = (s.count(2) > 0); // Pre-C++20 equivalent
#endif
count() is O(log n) and returns either 0 or 1 because std::set stores unique keys.
5. Why No top() or back()?
std::setis not a sequence container, so it doesn’t expose random-access operations likeoperator[],back(), ortop().- To access the smallest or largest element:
std::set<int> s = {3, 1, 4};
int smallest = *s.begin(); // first (minimum)
int largest = *s.rbegin(); // last (maximum)
// Always ensure set is non-empty before dereferencing
if (!s.empty()) {
std::cout << *s.begin() << " " << *s.rbegin();
}
Need heap-like top()? Use std::priority_queue. Need back()? Use std::vector/std::deque.
6. Iteration Patterns
void iterate(const std::set<int>& s) {
for (int value : s) {
// ascending order
}
for (auto it = s.rbegin(); it != s.rend(); ++it) {
// descending order
}
}
Iterators stay valid unless their element is erased. Operations that insert/erase different elements do not invalidate existing iterators.
7. Typical Use Cases
Deduplicating While Keeping Order
std::vector<int> data = {4, 1, 4, 2, 1};
std::set<int> unique(data.begin(), data.end()); // {1,2,4}
Ordered Set for Sliding-Window Median
std::multiset<int> window;
window.insert(value);
auto mid = std::next(window.begin(), window.size()/2);
Membership with Custom Comparator
struct CaseInsensitive {
bool operator()(const std::string& a, const std::string& b) const {
return std::lexicographical_compare(
a.begin(), a.end(), b.begin(), b.end(),
[](char x, char y) { return std::tolower(x) < std::tolower(y); });
}
};
std::set<std::string, CaseInsensitive> keywords = {"Allow", "Deny"};
8. Runtime Complexity Analysis
Understanding the time and space complexity of std::set operations is essential for choosing the right container and optimizing performance.
Time Complexity
| Operation | Time Complexity | Notes |
|---|---|---|
| Element Access | ||
begin(), end(), rbegin(), rend() |
O(1) | Iterator creation |
*begin(), *rbegin() |
O(1) | Access to min/max element |
| Lookup | ||
find(), count(), contains() (C++20) |
O(log n) | Binary search in balanced tree |
lower_bound(), upper_bound(), equal_range() |
O(log n) | Binary search operations |
| Modifiers | ||
insert() (single element) |
O(log n) | Tree insertion |
insert() (hint) |
O(1) amortized | If hint is correct, O(log n) otherwise |
insert() (range) |
O(m × log(n + m)) | m = range size, n = current size |
emplace(), emplace_hint() |
O(log n) | Similar to insert, avoids copies |
erase() (by key) |
O(log n) | Tree deletion |
erase() (by iterator) |
O(1) amortized | If iterator is valid |
erase() (range) |
O(m + log n) | m = number of elements erased |
clear() |
O(n) | Destroys all elements |
| Operations | ||
size(), empty(), max_size() |
O(1) | Constant time |
swap() |
O(1) | Constant time, swaps root pointers |
merge() (C++17) |
O(n × log(m + n)) | n = source size, m = destination size |
| Comparison | ||
==, != |
O(n) | Element-wise comparison |
<, >, <=, >= |
O(n) | Lexicographic comparison |
Space Complexity
- Storage: O(n) where n is the number of elements
- Node overhead: Each node stores key, parent pointer, left child, right child, color (for red-black tree)
- Total: Typically ~40-48 bytes per element on 64-bit systems (including tree structure)
Tree Structure Impact
std::set is implemented as a balanced binary search tree (typically red-black tree):
- Height: O(log n) guaranteed (red-black tree property)
- Balance: Automatically maintained, no manual rebalancing needed
- Iterator stability: Iterators remain valid unless the element is erased
Comparison with Other Containers
| Operation | std::set |
std::vector |
std::unordered_set |
|---|---|---|---|
| Insert | O(log n) | O(1) amortized (end) | O(1) average |
| Find | O(log n) | O(n) | O(1) average |
| Erase | O(log n) | O(n) | O(1) average |
| Ordered iteration | ✅ Yes | ✅ Yes | ❌ No |
| Memory overhead | Higher | Lower | Higher |
Performance Tips Based on Complexity
- Use
emplace()for complex types → Avoids unnecessary copies (still O(log n)) - Provide hint for
insert()when possible → Can achieve O(1) amortized if hint is correct - Prefer
find()overcount() == 1→ Both O(log n), butfind()is more semantic - Use
lower_bound()/upper_bound()for range queries → O(log n) instead of O(n) iteration - Consider
std::unordered_set→ If ordering is not needed, O(1) average vs O(log n) - Avoid frequent insertions/deletions in tight loops → Each operation is O(log n)
Example: Efficient Range Queries
std::set<int> s = {1, 3, 5, 7, 9, 11, 13, 15};
// ❌ Inefficient: O(n) iteration
for (auto it = s.begin(); it != s.end(); ++it) {
if (*it >= 5 && *it <= 10) {
// Process element
}
}
// ✅ Efficient: O(log n) to find range, O(k) to iterate
auto lower = s.lower_bound(5); // O(log n)
auto upper = s.upper_bound(10); // O(log n)
for (auto it = lower; it != upper; ++it) {
// Process element - O(k) where k is range size
}
When to Use std::set
✅ Use std::set when:
- You need ordered, unique elements
- Frequent lookups (O(log n))
- Range queries are common
- Iterator stability is important
❌ Avoid std::set when:
- Ordering is not needed → Use
std::unordered_set(O(1) average) - Frequent random access → Use
std::vector(O(1)) - Very large datasets with simple types → Consider
std::unordered_set
9. Best Practices
- Check emptiness before dereferencing
begin()/rbegin(). - Prefer
emplacefor complex types to avoid extra copies. - Use
containswhen building with C++20+, otherwise usecount()orfind(). - Remember logarithmic complexity: avoid heavy per-element work inside tight loops if possible.
- For multi-set semantics, use
std::multiset(allows duplicates).
9. Summary
std::setoffers ordered, unique storage with log-time operations.contains()exists starting in C++20; otherwise usecount()/find().- There is no
top()/back()becausestd::setis not a sequence container; use iterators (begin(),rbegin()) to access extremes. - Iterators are bidirectional and remain valid unless their element is erased.
Understanding these constraints lets you choose the right container (set vs vector vs priority queue) and apply std::set effectively in production C++.